This is a flow illustrating how to evaluate the performance of a classification system. It compares each prediction to the groundtruth, assigns a "Correct" or "Incorrect" grade, and aggregates the results to produce metrics such as accuracy, which reflects how well the system classifies the data.
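To make the idea concrete, here is a small standalone sketch in plain Python (independent of the flow itself; the sample records are made up for illustration):

```python
# Grade each prediction against its groundtruth, then aggregate
# the grades into an accuracy metric. Sample records are invented
# purely for demonstration.
records = [
    {"groundtruth": "APP", "prediction": "APP"},
    {"groundtruth": "PDF", "prediction": "APP"},
    {"groundtruth": "Academic", "prediction": "Academic"},
]

grades = [
    "Correct" if r["groundtruth"].lower() == r["prediction"].lower() else "Incorrect"
    for r in records
]

accuracy = grades.count("Correct") / len(grades)
print(grades)                       # ['Correct', 'Incorrect', 'Correct']
print(f"accuracy={accuracy:.2f}")   # accuracy=0.67
```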
Tools used in this flow:
- `python` tool
In this flow, you will learn
- how to compose a point-based evaluation flow, where you can calculate point-wise metrics.
- the way to log metrics with `from promptflow import log_metric` (see the sketch after this list)
    - see file `calculate_accuracy.py`
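A minimal sketch of such an aggregation node, assuming the standard `promptflow` imports; the actual `calculate_accuracy.py` in this folder is the authoritative version and may differ in detail:

```python
from typing import List

from promptflow import log_metric, tool


@tool
def calculate_accuracy(grades: List[str]):
    # Aggregation node: receives the list of per-line grades produced
    # by the grade node over the whole dataset.
    accuracy = round(grades.count("Correct") / len(grades), 2)
    # log_metric surfaces the value as a run-level metric of the evaluation run.
    log_metric("accuracy", accuracy)
    return accuracy
```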
Prepare your Azure OpenAI resource following the instructions and get your `api_key` if you don't have one.
```bash
# Override keys with --set to avoid yaml file changes
pf connection create --file ../../../connections/azure_openai.yml --set api_key=<your_api_key> api_base=<your_api_base>
```
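If you prefer the Python SDK over the CLI, you can verify the connection was created with something along these lines (a sketch, assuming the promptflow SDK is installed locally):

```python
from promptflow import PFClient

# List local connections to confirm the one created above exists;
# its name comes from ../../../connections/azure_openai.yml.
pf = PFClient()
for connection in pf.connections.list():
    print(connection.name, connection.type)
```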
```bash
# test with default input value in flow.dag.yaml
pf flow test --flow .

# test with flow inputs
pf flow test --flow . --inputs groundtruth=APP prediction=APP

# test node with inputs
pf flow test --flow . --node grade --inputs groundtruth=groundtruth prediction=prediction
```
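The `grade` node exercised by the last command is a plain `python` tool. A minimal sketch of what such a node can look like (the actual implementation in this folder is the source of truth):

```python
from promptflow import tool


@tool
def grade(groundtruth: str, prediction: str) -> str:
    # Case-insensitive comparison of the predicted label against the
    # groundtruth label; any mismatch is graded "Incorrect".
    return "Correct" if groundtruth.lower() == prediction.lower() else "Incorrect"
```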
There are two ways to evaluate a classification flow.
```bash
pf run create --flow . --data ./data.jsonl --stream
```
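The same batch run can also be created from Python; a sketch using the promptflow SDK, assuming it is run from this folder:

```python
from promptflow import PFClient

pf = PFClient()

# Equivalent of `pf run create --flow . --data ./data.jsonl --stream`:
# every line of data.jsonl is sent through the evaluation flow.
run = pf.run(flow=".", data="./data.jsonl")
pf.stream(run)              # stream run logs, like --stream on the CLI
print(pf.get_metrics(run))  # aggregated metrics, e.g. {"accuracy": ...}
```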
Learn more in the web-classification example.