support basic text output in tests #1113

Merged: 2 commits merged on Feb 8, 2025
6 changes: 4 additions & 2 deletions docs/src/content/docs/reference/scripts/tests.mdx
@@ -156,13 +156,15 @@

#### transform

By default, the `asserts` are executed on the raw LLM output.
However, you can use a JavaScript expression to select a part of the output to test.
By default, GenAIScript extracts the `text` field from the output before sending it to PromptFoo.

You can disable this mode by setting `format: "json"`; then the `asserts` are executed on the raw LLM output.
You can use a JavaScript expression to select a part of the output to test.
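
For instance, a minimal sketch (the file name and assertion value below are illustrative) that keeps the raw JSON output for the test but still checks its `text` field through a per-assertion `transform`:

```js
script({
    tests: {
        files: "src/sample.txt",
        format: "json",
        asserts: [
            {
                type: "icontains",
                value: "cancel",
                // evaluate this assertion against the text field of the raw output
                transform: "output.text",
            },
        ],
    },
})
```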

```js title="proofreader.genai.js" wrap "transform"
script({
    tests: {
        files: "src/will-trigger.cancel.txt",
        format: "json",
        asserts: {
            type: "equals",
            value: "cancelled",
84 changes: 49 additions & 35 deletions packages/core/src/test.ts
@@ -11,7 +11,7 @@ import {
import { arrayify, logWarn } from "./util"
import { runtimeHost } from "./host"
import { ModelConnectionInfo, parseModelIdentifier } from "./models"
import { deleteUndefinedValues } from "./cleaners"
import { deleteEmptyValues, deleteUndefinedValues } from "./cleaners"
import testSchema from "../../../docs/public/schemas/tests.json"
import { validateJSONWithSchema } from "./schema"
import { TraceOptions } from "./trace"
@@ -175,7 +175,6 @@ export async function generatePromptFooConfiguration(
}

const cli = options?.cli
const transform = "output.text"

const resolveModel = (m: string) => runtimeHost.modelAliases[m]?.model ?? m

@@ -186,6 +185,14 @@
const defaultTest = deleteUndefinedValues({
options: deleteUndefinedValues({ provider: testProvider }),
})
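// Transforms keyed by the test `format`: `testTransforms` feeds the
// test-level `options.transform`, while `assertTransforms` is the fallback
// `transform` applied to each assertion that does not declare its own.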
const testTransforms = {
text: "output.text",
json: undefined as string,
}
const assertTransforms = {
text: undefined as string,
json: "output.text",
}

// Create configuration object
const res = {
@@ -238,42 +238,49 @@
vars,
rubrics,
facts,
format = "text",
keywords = [],
forbidden = [],
asserts = [],
}) => ({
description,
vars: deleteUndefinedValues({
files,
vars,
}),
assert: [
...arrayify(keywords).map((kv) => ({
type: "icontains", // Check if output contains keyword
value: kv,
transform,
})),
...arrayify(forbidden).map((kv) => ({
type: "not-icontains", // Check if output does not contain forbidden keyword
value: kv,
transform,
})),
...arrayify(rubrics).map((value) => ({
type: "llm-rubric", // Use LLM rubric for evaluation
value,
transform,
})),
...arrayify(facts).map((value) => ({
type: "factuality", // Check factuality of output
value,
transform,
})),
...arrayify(asserts).map((assert) => ({
...assert,
transform: assert.transform || transform, // Default transform
})),
].filter((a) => !!a), // Filter out any undefined assertions
})
}) =>
deleteEmptyValues({
description,
vars: deleteEmptyValues({
files,
workspaceFiles,
vars,
}),
options: {
transform: testTransforms[format],
},
assert: [
...arrayify(keywords).map((kv) => ({
type: "icontains", // Check if output contains keyword
value: kv,
transform: assertTransforms[format],
})),
...arrayify(forbidden).map((kv) => ({
type: "not-icontains", // Check if output does not contain forbidden keyword
value: kv,
transform: assertTransforms[format],
})),
...arrayify(rubrics).map((value) => ({
type: "llm-rubric", // Use LLM rubric for evaluation
value,
transform: assertTransforms[format],
})),
...arrayify(facts).map((value) => ({
type: "factuality", // Check factuality of output
value,
transform: assertTransforms[format],
})),
...arrayify(asserts).map((assert) => ({
...assert,
transform:
assert.transform || assertTransforms[format], // Default transform
})),
].filter((a) => !!a), // Filter out any undefined assertions
})
),
}

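For orientation, a rough sketch of the promptfoo test entries this mapping produces (descriptions, file names, and assertion values below are made up): the default `format: "text"` sets a test-level transform so every assertion sees the extracted text, while `format: "json"` omits the test-level transform and instead adds `output.text` as the fallback transform on individual assertions.

```js
// Illustrative only: approximate shape of a generated test entry.

// format: "text" (default): test-level transform, assertions see plain text
const textFormatTest = {
    description: "sample test",
    vars: { files: "src/sample.txt" },
    options: { transform: "output.text" },
    assert: [{ type: "icontains", value: "hello" }],
}

// format: "json": no test-level transform; text-based assertions
// fall back to a per-assertion transform
const jsonFormatTest = {
    description: "sample test",
    vars: { files: "src/sample.txt" },
    assert: [{ type: "icontains", value: "hello", transform: "output.text" }],
}
```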
5 changes: 5 additions & 0 deletions packages/core/src/types/prompt_template.d.ts
@@ -507,6 +507,11 @@ interface PromptTest {
* Additional deterministic assertions.
*/
asserts?: PromptAssertion | PromptAssertion[]

/**
* Determines what kind of output is sent back to the test engine. Default is "text".
*/
format?: "text" | "json"
}

interface ContentSafetyOptions {