VO Walkthrough Guide
A comprehensive guide to AI21 Maestro’s Validated Output capabilities.
AI21 Maestro is an intelligent agentic system designed to handle complex AI workflows.
This guide focuses specifically on the Validated Output feature, with practical examples ranging from basic usage to advanced scenarios.
Understanding the Problem
Traditional LLM interactions often look like this:
# Traditional approach - unreliable
response = client.complete(
    prompt="""Write a Python function that:
    - Calculates fibonacci numbers
    - Is under 10 lines
    - Has proper docstrings
    - Uses descriptive variable names"""
)
# Sometimes works perfectly, sometimes doesn't follow all constraints
Common issues:
- Inconsistent adherence to multiple constraints
- No way to know which requirements were missed
- Manual trial-and-error to get desired output
How AI21 Maestro's Validated Output Works
AI21 Maestro's Validated Output uses a Generate → Validate → Fix cycle:
- Generate: Creates initial response following your requirements
- Validate: Evaluates and scores each requirement (0.0 to 1.0)
- Fix: Refines output for requirements that scored < 1.0
- Repeat: Continues until all requirements are met or budget is exhausted
This systematic approach to instruction following is part of AI21 Maestro's broader agentic architecture, designed to handle complex workflows with reliability and precision.
Input + Requirements → Generate → Validate → Fix → Final Output + Report
                                      ↑        │
                                      └────────┘
(Fix loops back to Validate until every requirement passes or the budget is exhausted)
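The cycle can be sketched in plain Python. Everything below is illustrative only — `generate`, `validate`, and `fix` are stand-in callables and `max_rounds` stands in for the budget; this is not the actual Maestro internals:

```python
def run_validated_output(generate, validate, fix, max_rounds=3):
    """Illustrative Generate -> Validate -> Fix loop (not the real SDK internals).

    validate() returns {requirement_name: score} with scores in [0.0, 1.0].
    """
    output = generate()
    for _ in range(max_rounds):
        scores = validate(output)
        # Collect every requirement that scored below 1.0
        failing = {name: s for name, s in scores.items() if s < 1.0}
        if not failing:
            return output, scores, "all_requirements_met"
        # Refine the output, targeting only the unmet requirements
        output = fix(output, failing)
    return output, validate(output), "budget_exhausted"
```

The real system additionally produces a per-requirement report, covered in "Understanding the Requirements Report" below.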
Using the API
The Input parameter
You can pass a string to AI21 Maestro as an input and it will be treated as a user message.
from ai21 import AI21Client
client = AI21Client(api_key="your-api-key")
# create_and_poll blocks until the run completes (or the default polling timeout is reached)
run = client.beta.maestro.runs.create_and_poll(
    input="Explain quantum computing to a 10-year-old",
    requirements=[
        {
            "name": "reading_level",
            "description": "Use simple words appropriate for a 10-year-old"
        },
        {
            "name": "length",
            "description": "Keep explanation under 100 words"
        }
    ]
)
print(run.result)
Alternatively, you can pass the input as an array of messages to support multiple turns in a conversation.
input=[
    {
        "role": "user",
        "content": "Explain quantum computing to a 10-year-old",
    },
    {
        "role": "assistant",
        "content": 'Quantum computing is like a super-smart computer that uses tiny things called "qubits" instead of regular bits. While regular bits are like tiny switches that can be off (0) or on (1), qubits can be both at the same time! This helps quantum computers solve really hard problems much faster than normal computers by trying many possibilities at once.',
    },
    {
        "role": "user",
        "content": "Translate this to Spanish",
    },
],
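Appending a follow-up turn to an existing history is a common pattern. The helper below is a hypothetical convenience function (not part of the SDK) that returns a new message list without mutating the original:

```python
def with_followup(history, text):
    """Return a new message list with an extra user turn appended."""
    return history + [{"role": "user", "content": text}]
```

The returned list is what you would pass as `input=` to `create_and_poll`.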
Working with Requirements
Writing Effective Requirements
Good Requirements:
requirements = [
    {
        "name": "word_count",
        "description": "Response must be between 150 and 200 words"
    },
    {
        "name": "json_format",
        "description": "Output must be valid JSON with 'title' and 'content' fields"
    },
    {
        "name": "no_technical_jargon",
        "description": "Avoid technical terms; explain concepts in plain English"
    }
]
Requirements to Avoid:
# Too vague
{"name": "good_quality", "description": "Make it good"}
# Contradictory
{"name": "short_and_detailed", "description": "Be brief but very detailed"}
# Unmeasurable
{"name": "creative", "description": "Be creative and original"}
Requirement Categories
Format Requirements:
{
    "name": "markdown_format",
    "description": "Use proper markdown with headers, bullet points, and code blocks"
}
Content Requirements:
{
    "name": "include_examples",
    "description": "Provide at least 2 concrete examples for each concept"
}
Style Requirements:
{
    "name": "professional_tone",
    "description": "Use formal business language, avoid contractions and slang"
}
Technical Requirements:
{
    "name": "python_best_practices",
    "description": "Follow PEP 8 style guidelines and use type hints"
}
Understanding the Requirements Report
Enable detailed reporting by adding "requirements_result" to the include parameter:
run = client.beta.maestro.runs.create_and_poll(
    input="Write a product review for a smartphone",
    requirements=[
        {"name": "word_count", "description": "Use 200-250 words"},
        {"name": "pros_and_cons", "description": "Include both pros and cons sections"},
        {"name": "rating", "description": "End with a 1-5 star rating. For example: (★★★★☆)"}
    ],
    include=["requirements_result"],
    budget="low"
)

print(f"Result: {run.result}")

# Analyze the results
print(f"Overall Score: {run.requirements_result['score']}")
print(f"Completion Reason: {run.requirements_result['finish_reason']}")
print("Requirements Results:")
for req in run.requirements_result["requirements"]:
    print(f"  {req['name']}: {req['score']}")
    print(f"  Issue: {req['reason']}")
Sample Output Analysis
# Example output
Overall Score: 0.67
Completion Reason: Budget exhausted
word_count: 1.0
pros_and_cons: 1.0
rating: 0.6
Issue: Rating format is '4 out of 5' instead of star format (★★★★☆)
This tells you:
- 2 out of 3 requirements were perfectly met
- The rating requirement needs refinement
- You might need a higher budget or clearer requirement
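That analysis can be automated. The helper below suggests a follow-up action from a requirements_result dict; it is illustrative, with field names matching the report shown above:

```python
def next_action(result):
    """Suggest what to do next based on a requirements_result dict (illustrative)."""
    unmet = [r["name"] for r in result["requirements"] if r["score"] < 1.0]
    if not unmet:
        return "done", []
    if result.get("finish_reason") == "Budget exhausted":
        # The run gave up mid-refinement; a higher budget may finish the job
        return "retry_with_higher_budget", unmet
    # Otherwise the requirement wording itself likely needs sharpening
    return "clarify_requirements", unmet
```

Applied to the sample output above, it would single out the rating requirement and suggest retrying with a higher budget.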
Budget Control and Performance
Budget Levels Explained
# High budget - maximum reliability (~100 seconds for complex tasks)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="high"
)

# Medium budget - balanced approach (~60 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="medium"
)

# Low budget - still validates, but favors low latency (~20 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="low"
)
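If you have a latency target, the approximate timings above suggest a simple budget picker. The numbers below are the rough figures quoted in the comments, not guarantees, and the helper is purely illustrative:

```python
# Rough typical latencies from the comments above (seconds); not SLAs
BUDGET_LATENCY = {"low": 20, "medium": 60, "high": 100}

def pick_budget(max_seconds):
    """Choose the most thorough budget whose typical latency fits the deadline."""
    for level in ("high", "medium", "low"):
        if BUDGET_LATENCY[level] <= max_seconds:
            return level
    return "low"  # fall back to the fastest option
```

For example, a 30-second deadline would map to "low", while anything over ~100 seconds allows "high".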
Using Third-Party Models
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    models=["gpt-4o"],  # specify a preferred model
    budget="high"
)