Running Evaluations

Starting a new evaluation

  1. Go to the Evaluations page
  2. Select the Human annotation tab
  3. Click Start new evaluation

Configuring your evaluation

  1. Select your test set - Choose the data you want to evaluate against
  2. Select your revision - Pick the version of your application to test

Warning: Your test set columns must match the input variables in your revision. If they don't match, you'll see an error message. (The sketch after this list illustrates the rule.)

  3. Choose evaluators - Select how you want to measure performance
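
As an example of the column-matching rule, the sketch below pairs a hypothetical revision template with a matching test set and runs a local sanity check. The variable names (country, topic) and the check itself are illustrative assumptions, not part of the platform.

```python
import string

# Hypothetical revision template taking the input variables
# "country" and "topic" (names are illustrative only).
prompt_template = "Write a short note about {topic} in {country}."

# Each test set row must supply one column per input variable,
# with exactly the same names.
test_set = [
    {"country": "France", "topic": "cuisine"},
    {"country": "Japan", "topic": "transit"},
]

# Local sanity check: collect the template's variables and confirm
# every row has a matching column before starting the evaluation.
variables = {
    name
    for _, name, _, _ in string.Formatter().parse(prompt_template)
    if name
}
for row in test_set:
    missing = variables - row.keys()
    assert not missing, f"test set row is missing columns: {missing}"
```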

Running the evaluation

After configuring:

  1. Click Start evaluation
  2. You'll be redirected to the annotation interface
  3. Click Run all to generate outputs and begin evaluation (see the sketch below)
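
Conceptually, Run all produces one output per test case. In the sketch below, the generate function is a hypothetical stand-in for your application revision, not a platform API.

```python
# Illustrative only: "Run all" conceptually generates one output per
# test set row. `generate` stands in for your application revision.
def generate(inputs: dict) -> str:
    # A real revision would render its prompt from `inputs` and call the model.
    return f"(model output about {inputs['topic']} in {inputs['country']})"

test_set = [
    {"country": "France", "topic": "cuisine"},
    {"country": "Japan", "topic": "transit"},
]

# One output per test case, ready for annotation.
outputs = [generate(row) for row in test_set]
```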

Annotating responses

For each test case:

  1. Review the input and output
  2. Use the evaluation form on the right to score the response
  3. Click Annotate to save your assessment (a sketch of what an annotation captures follows this list)
  4. Click Next to move to the next test case
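
For a rough mental model, each saved annotation ties together the test case inputs, the generated output, and your scores from the evaluation form. The record below is a hypothetical shape, not the platform's actual schema; field names and score types depend on the evaluators you configured.

```python
# Hypothetical annotation record (not the platform's actual schema):
# one saved assessment links the inputs, the output, and the scores.
annotation = {
    "inputs": {"country": "France", "topic": "cuisine"},
    "output": "(model output about cuisine in France)",
    "scores": {"correctness": 4, "tone": "good"},  # evaluator fields are illustrative
    "annotator": "reviewer@example.com",
}
```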
Tip: Select the Unannotated tab to see only the test cases you haven't reviewed yet.

Collaboration

You can invite team members to help with evaluation by sharing the evaluation link. Team members must be added to your workspace first.

Next steps