question, answer, context)

System-managed fields on examples (id, created_at, updated_at) are auto-generated by the server -- never include them in create or append payloads.
Proceed directly with the task — run the ax command you need. Do NOT check versions, env vars, or profiles upfront.
If an ax command fails, troubleshoot based on the error:
- `command not found` or a version error → see references/ax-setup.md.
- `401 Unauthorized` / missing API key → run `ax profiles show` to inspect the current profile. If the profile is missing or the API key is wrong, check `.env` for `ARIZE_API_KEY` and use it to create or update the profile via references/ax-profiles.md. If `.env` has no key either, ask the user for their Arize API key (https://app.arize.com/admin > API Keys).
- Missing space ID → check `.env` for `ARIZE_SPACE_ID`, run `ax spaces list -o json`, or ask the user.
- Missing project → check `.env` for `ARIZE_DEFAULT_PROJECT`, ask the user, or run `ax projects list -o json --limit 100` and present the results as selectable options.

## ax datasets list

Browse datasets in a space. Output goes to stdout.
```shell
ax datasets list
ax datasets list --space-id SPACE_ID --limit 20
ax datasets list --cursor CURSOR_TOKEN
ax datasets list -o json
```
| Flag | Type | Default | Description |
|---|---|---|---|
| `--space-id` | string | from profile | Filter by space |
| `--limit, -l` | int | 15 | Max results (1-100) |
| `--cursor` | string | none | Pagination cursor from previous response |
| `-o, --output` | string | table | Output format: table, json, csv, parquet, or file path |
| `-p, --profile` | string | default | Configuration profile |
## ax datasets get

Quick metadata lookup -- returns dataset name, space, timestamps, and version list.
```shell
ax datasets get DATASET_ID
ax datasets get DATASET_ID -o json
```
| Flag | Type | Default | Description |
|---|---|---|---|
| `DATASET_ID` | string | required | Positional argument |
| `-o, --output` | string | table | Output format |
| `-p, --profile` | string | default | Configuration profile |
| Field | Type | Description |
|---|---|---|
| `id` | string | Dataset ID |
| `name` | string | Dataset name |
| `space_id` | string | Space this dataset belongs to |
| `created_at` | datetime | When the dataset was created |
| `updated_at` | datetime | Last modification time |
| `versions` | array | List of dataset versions (id, name, dataset_id, created_at, updated_at) |
## ax datasets export

Download all examples to a file. Use `--all` for datasets larger than 500 examples (unlimited bulk export).
```shell
ax datasets export DATASET_ID
# -> dataset_abc123_20260305_141500/examples.json

ax datasets export DATASET_ID --all
ax datasets export DATASET_ID --version-id VERSION_ID
ax datasets export DATASET_ID --output-dir ./data
ax datasets export DATASET_ID --stdout
ax datasets export DATASET_ID --stdout | jq '.[0]'
```
| Flag | Type | Default | Description |
|---|---|---|---|
| `DATASET_ID` | string | required | Positional argument |
| `--version-id` | string | latest | Export a specific dataset version |
| `--all` | bool | false | Unlimited bulk export (use for datasets > 500 examples) |
| `--output-dir` | string | . | Output directory |
| `--stdout` | bool | false | Print JSON to stdout instead of file |
| `-p, --profile` | string | default | Configuration profile |
Agent auto-escalation rule: If an export returns exactly 500 examples, the result is likely truncated — re-run with --all to get the full dataset.
Export completeness verification: After exporting, confirm the row count matches what the server reports:
```shell
# Get the server-reported count from dataset metadata
ax datasets get DATASET_ID -o json | jq '.versions[-1] | {version: .id, examples: .example_count}'

# Compare to what was exported
jq 'length' dataset_*/examples.json

# If counts differ, re-export with --all
```
Output is a JSON array of example objects. Each example has system fields (id, created_at, updated_at) plus all user-defined fields:
```json
[
  {
    "id": "ex_001",
    "created_at": "2026-01-15T10:00:00Z",
    "updated_at": "2026-01-15T10:00:00Z",
    "question": "What is 2+2?",
    "answer": "4",
    "topic": "math"
  }
]
```
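When re-using exported examples as the payload for a new dataset, the server-managed fields must come out first. A minimal jq sketch (the sample object mirrors the export format above):

```shell
# Server-managed fields (id, created_at, updated_at) must be stripped
# before an exported example can be re-used as a create/append payload.
echo '[{"id": "ex_001", "created_at": "2026-01-15T10:00:00Z", "updated_at": "2026-01-15T10:00:00Z", "question": "What is 2+2?", "answer": "4"}]' \
  | jq 'map(del(.id, .created_at, .updated_at))'
```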
## ax datasets create

Create a new dataset from a data file.
```shell
ax datasets create --name "My Dataset" --space-id SPACE_ID --file data.csv
ax datasets create --name "My Dataset" --space-id SPACE_ID --file data.json
ax datasets create --name "My Dataset" --space-id SPACE_ID --file data.jsonl
ax datasets create --name "My Dataset" --space-id SPACE_ID --file data.parquet
```
| Flag | Type | Required | Description |
|---|---|---|---|
| `--name, -n` | string | yes | Dataset name |
| `--space-id` | string | yes | Space to create the dataset in |
| `--file, -f` | path | yes | Data file: CSV, JSON, JSONL, or Parquet |
| `-o, --output` | string | no | Output format for the returned dataset metadata |
| `-p, --profile` | string | no | Configuration profile |
Use --file - to pipe data directly — no temp file needed:
```shell
echo '[{"question": "What is 2+2?", "answer": "4"}]' | ax datasets create --name "my-dataset" --space-id SPACE_ID --file -

# Or with a heredoc
ax datasets create --name "my-dataset" --space-id SPACE_ID --file - << 'EOF'
[{"question": "What is 2+2?", "answer": "4"}]
EOF
```
To add rows to an existing dataset, use ax datasets append --json '[...]' instead — no file needed.
| Format | Extension | Notes |
|---|---|---|
| CSV | `.csv` | Column headers become field names |
| JSON | `.json` | Array of objects |
| JSON Lines | `.jsonl` | One object per line (NOT a JSON array) |
| Parquet | `.parquet` | Column names become field names; preserves types |
Format gotchas:
- CSV: `null` becomes an empty string. Use JSON/Parquet to preserve types.
- A JSON array (`[{...}, {...}]`) in a `.jsonl` file will fail — use the `.json` extension instead.
- Parquet: use pandas/pyarrow to read locally: `pd.read_parquet("examples.parquet")`.

## ax datasets append

Add examples to an existing dataset. Two input modes -- use whichever fits.
Generate the payload directly -- no temp files needed:
```shell
ax datasets append DATASET_ID --json '[{"question": "What is 2+2?", "answer": "4"}]'

ax datasets append DATASET_ID --json '[
  {"question": "What is gravity?", "answer": "A fundamental force..."},
  {"question": "What is light?", "answer": "Electromagnetic radiation..."}
]'

ax datasets append DATASET_ID --file new_examples.csv
ax datasets append DATASET_ID --file additions.json

ax datasets append DATASET_ID --json '[{"q": "..."}]' --version-id VERSION_ID
```
| Flag | Type | Required | Description |
|---|---|---|---|
| `DATASET_ID` | string | yes | Positional argument |
| `--json` | string | mutex | JSON array of example objects |
| `--file, -f` | path | mutex | Data file (CSV, JSON, JSONL, Parquet) |
| `--version-id` | string | no | Append to a specific version (default: latest) |
| `-o, --output` | string | no | Output format for the returned dataset metadata |
| `-p, --profile` | string | no | Configuration profile |
Exactly one of --json or --file is required.
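For large additions, the payload can be split into batches locally before appending. A sketch using only core jq (the batch size of 2 and the file name are illustrative):

```shell
# Split a large array into batches; each output line is one compact
# JSON array of up to 2 examples.
printf '[{"n": 1}, {"n": 2}, {"n": 3}]' > big.json

jq -c '. as $a | range(0; length; 2) as $i | $a[$i:$i + 2]' big.json
# Each line can then be passed to:
#   ax datasets append DATASET_ID --json "$batch"
```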
Schema validation before append: If the dataset already has examples, inspect its schema before appending to avoid silent field mismatches:
```shell
# Check existing field names in the dataset
ax datasets export DATASET_ID --stdout | jq '.[0] | keys'

# Verify your new data has matching field names
echo '[{"question": "..."}]' | jq '.[0] | keys'

# Both outputs should show the same user-defined fields
```
Fields are free-form: extra fields in new examples are added, and missing fields become null. However, typos in field names (e.g., queston vs question) create new columns silently -- verify spelling before appending.
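That key comparison can be exercised entirely locally; jq's `keys` sorts alphabetically, so matching payloads produce identical lists (the field names here are illustrative):

```shell
# Compare the field names of an existing example against a new payload.
existing='{"question": "What is 2+2?", "answer": "4"}'
new='[{"question": "What is gravity?", "answer": "A force"}]'

# Both should print the same sorted key list.
echo "$existing" | jq -c 'keys'
echo "$new" | jq -c '.[0] | keys'
```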
## ax datasets delete

```shell
ax datasets delete DATASET_ID
ax datasets delete DATASET_ID --force   # skip confirmation prompt
```
| Flag | Type | Default | Description |
|---|---|---|---|
| `DATASET_ID` | string | required | Positional argument |
| `--force, -f` | bool | false | Skip confirmation prompt |
| `-p, --profile` | string | default | Configuration profile |
Users often refer to datasets by name rather than ID. Resolve a name to an ID before running other commands:
```shell
# Find dataset ID by name
ax datasets list -o json | jq '.[] | select(.name == "eval-set-v1") | .id'

# If the list is paginated, fetch more
ax datasets list -o json --limit 100 | jq '.[] | select(.name | test("eval-set")) | {id, name}'
```
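The same selection can be tried against a local sample; the JSON shape below is an assumption based on the dataset fields documented above:

```shell
# Simulated `ax datasets list -o json` output (shape assumed).
listing='[{"id": "ds_1", "name": "eval-set-v1"}, {"id": "ds_2", "name": "eval-set-v2"}]'

# -r strips the JSON quotes so the ID is ready for the next command.
echo "$listing" | jq -r '.[] | select(.name == "eval-set-v1") | .id'
```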
Create an eval dataset with fields like (input, expected_output). Pipe inline data with `--file -` (see the Create Dataset section), or load from a file:

```shell
ax datasets create --name "eval-set-v1" --space-id SPACE_ID --file eval_data.csv
ax datasets get DATASET_ID
```
```shell
# Find the dataset
ax datasets list

# Append inline or from a file (see Append Examples section for full syntax)
ax datasets append DATASET_ID --json '[{"question": "...", "answer": "..."}]'
ax datasets append DATASET_ID --file additional_examples.csv
```
```shell
ax datasets list                 # find the dataset
ax datasets export DATASET_ID    # download to file
jq '.[] | .question' dataset_*/examples.json
```
```shell
# List versions
ax datasets get DATASET_ID -o json | jq '.versions'

# Export that version
ax datasets export DATASET_ID --version-id VERSION_ID
```
```shell
ax datasets export DATASET_ID
ax datasets append DATASET_ID --file new_rows.csv
ax datasets create --name "eval-set-v2" --space-id SPACE_ID --file updated_data.json
```
```shell
# Count examples
ax datasets export DATASET_ID --stdout | jq 'length'

# Extract a single field
ax datasets export DATASET_ID --stdout | jq '.[].question'

# Convert to CSV with jq
ax datasets export DATASET_ID --stdout | jq -r '.[] | [.question, .answer] | @csv'
```
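To include a header row in the CSV output, jq can emit it ahead of the data rows (the field names are illustrative):

```shell
# Emit a header row, then one CSV row per example.
echo '[{"question": "What is 2+2?", "answer": "4"}]' \
  | jq -r '(["question", "answer"]), (.[] | [.question, .answer]) | @csv'
```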
Examples are free-form JSON objects. There is no fixed schema -- columns are whatever fields you provide. System-managed fields are added by the server:
| Field | Type | Managed by | Notes |
|---|---|---|---|
| `id` | string | server | Auto-generated UUID. Required on update, forbidden on create/append |
| `created_at` | datetime | server | Immutable creation timestamp |
| `updated_at` | datetime | server | Auto-updated on modification |
| (any user field) | any JSON type | user | String, number, boolean, null, nested object, array |
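A local pre-flight check against the server-managed and reserved fields can be sketched with jq; the reserved names are taken from the troubleshooting table below:

```shell
# Exit non-zero if any example carries a server-managed or reserved field.
payload='[{"question": "What is 2+2?", "answer": "4"}]'
echo "$payload" | jq -e '
  map(keys[]) | unique
  | all(. as $k
      | (["id", "created_at", "updated_at", "time", "count"] | index($k) | not)
        and ($k | startswith("source_record_") | not))
' > /dev/null && echo "payload is clean"
```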
- arize-trace
- arize-experiment
- arize-prompt-optimization
| Problem | Solution |
|---|---|
| `ax: command not found` | See references/ax-setup.md |
| `401 Unauthorized` | API key is wrong, expired, or doesn't have access to this space. Fix the profile using references/ax-profiles.md. |
| `No profile found` | No profile is configured. See references/ax-profiles.md to create one. |
| `Dataset not found` | Verify the dataset ID with `ax datasets list` |
| `File format error` | Supported formats: CSV, JSON, JSONL, Parquet. Use `--file -` to read from stdin. |
| `platform-managed column` | Remove `id`, `created_at`, `updated_at` from create/append payloads |
| `reserved column` | Remove `time`, `count`, or any `source_record_*` field |
| `Provide either --json or --file` | Append requires exactly one input source |
| `Examples array is empty` | Ensure your JSON array or file contains at least one example |
| `not a JSON object` | Each element in the `--json` array must be a `{...}` object, not a string or number |
See references/ax-profiles.md § Save Credentials for Future Use.