AI21 Maestro is an intelligent agentic system designed to handle complex AI workflows.
This guide focuses on Validated Output, with practical examples that range from basic usage to advanced scenarios.
Understanding the Problem
Traditional LLM interactions often look like this:
# Traditional approach - unreliable
response = client.complete(
    prompt="""Write a Python function that:
- Calculates fibonacci numbers
- Is under 10 lines
- Has proper docstrings
- Uses descriptive variable names"""
)
# Sometimes works perfectly, sometimes doesn't follow all constraints
Common issues:
- LLMs often fail to consistently meet all the individual requirements outlined in the prompt
- There is no visibility into which requirements were not met
- Requires manual trial and error to achieve the desired output
How AI21 Maestro Works
Maestro’s instruction-following enhancer uses a Generate → Validate → Fix cycle:
- Generate: Creates initial response following your requirements
- Validate: Evaluates and scores each requirement (0.0 to 1.0)
- Fix: Refines output for requirements that scored < 1.0
- Repeat: Continues until all requirements are met or budget is exhausted
This systematic approach to instruction following is part of Maestro’s broader agentic architecture, designed to handle complex workflows with reliability and precision.
Input + Requirements → Generate → Validate → Fix → Final Output + Report
↑ ↓
← ← ← ← ←
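The cycle above can be sketched in plain Python. This is only an illustration of the control flow, not Maestro's actual implementation: the `generate`, `validate`, and `fix` callables, the `max_rounds` stand-in for the budget, and the toy checks are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple

Requirement = Dict[str, str]   # {"name": ..., "description": ...}
Scores = Dict[str, float]      # requirement name -> score in [0.0, 1.0]


def generate_validate_fix(
    task: str,
    requirements: List[Requirement],
    generate: Callable[[str, List[Requirement]], str],
    validate: Callable[[str, List[Requirement]], Scores],
    fix: Callable[[str, List[Requirement]], str],
    max_rounds: int = 3,  # hypothetical stand-in for the budget
) -> Tuple[str, Scores, str]:
    """Illustrative loop: generate once, then validate and fix until done."""
    output = generate(task, requirements)
    scores = validate(output, requirements)
    for _ in range(max_rounds):
        # Only requirements that scored below 1.0 are sent to the fix step.
        failing = [r for r in requirements if scores[r["name"]] < 1.0]
        if not failing:
            return output, scores, "all requirements met"
        output = fix(output, failing)
        scores = validate(output, requirements)
    if all(score >= 1.0 for score in scores.values()):
        return output, scores, "all requirements met"
    return output, scores, "budget exhausted"


# Toy stand-ins so the sketch runs without any API access.
toy_requirements = [
    {"name": "uppercase", "description": "Text must be upper case"},
    {"name": "ends_with_period", "description": "Text must end with a period"},
]


def toy_generate(task: str, reqs: List[Requirement]) -> str:
    return task


def toy_validate(text: str, reqs: List[Requirement]) -> Scores:
    checks = {
        "uppercase": text.isupper(),
        "ends_with_period": text.endswith("."),
    }
    return {r["name"]: 1.0 if checks[r["name"]] else 0.0 for r in reqs}


def toy_fix(text: str, failing: List[Requirement]) -> str:
    for req in failing:
        if req["name"] == "uppercase":
            text = text.upper()
        elif req["name"] == "ends_with_period":
            text += "."
    return text


final_text, final_scores, reason = generate_validate_fix(
    "hello", toy_requirements, toy_generate, toy_validate, toy_fix
)
print(final_text, final_scores, reason)
```

Setting `max_rounds=0` in this sketch returns the first draft unchanged with the reason "budget exhausted", which mirrors the case where the budget runs out before every requirement reaches 1.0.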
Using the API
The Input parameter
You can pass a string to Maestro as an input and it will be treated as a user message.
from ai21 import AI21Client

client = AI21Client(api_key="your-api-key")

# create_and_poll blocks until the run finishes or the default timeout is reached
client.beta.maestro.runs.create_and_poll(
    input="Explain quantum computing to a 10-year-old",
    requirements=[
        {
            "name": "reading_level",
            "description": "Use simple words appropriate for a 10-year-old"
        },
        {
            "name": "length",
            "description": "Keep explanation under 100 words"
        }
    ]
)
Alternatively, you can pass the input as an array of messages to support multiple turns in a conversation.
input=[
    {
        "role": "user",
        "content": "Explain quantum computing to a 10-year-old",
    },
    {
        "role": "assistant",
        "content": 'Quantum computing is like a super-smart computer that uses tiny things called "qubits" instead of regular bits. While regular bits are like tiny switches that can be off (0) or on (1), qubits can be both at the same time! This helps quantum computers solve really hard problems much faster than normal computers by trying many possibilities at once.',
    },
    {
        "role": "user",
        "content": "Translate this to Spanish",
    },
],
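Because the input can be either a plain string or a list of messages, it can help to normalize both forms before building a run. The helper below is a hypothetical client-side sketch, not part of the AI21 SDK: the name `as_messages` is made up here, and the allowed role set is an assumption based only on the two roles that appear in the example above.

```python
from typing import Dict, List, Union

Message = Dict[str, str]

# Assumption: only the roles shown in this guide's example are accepted.
ASSUMED_ROLES = {"user", "assistant"}


def as_messages(input_value: Union[str, List[Message]]) -> List[Message]:
    """Return the input as a list of {"role", "content"} messages."""
    if isinstance(input_value, str):
        # A bare string is treated as a single user message.
        return [{"role": "user", "content": input_value}]
    for position, message in enumerate(input_value):
        if message.get("role") not in ASSUMED_ROLES:
            raise ValueError(f"message {position}: unexpected role {message.get('role')!r}")
        if not message.get("content"):
            raise ValueError(f"message {position}: empty content")
    return list(input_value)


print(as_messages("Explain quantum computing to a 10-year-old"))
```

Running a check like this locally catches malformed conversations before a run is started.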
System Prompt
The system_prompt defines the agent’s identity, operating principles, and boundaries before it processes any input.
It guides how Maestro interprets inputs, chooses tools, and reasons throughout the run.
Use it to:
- Define the agent’s role and identity
Example: “You are a cautious financial journalist. Verify all data before reporting.”
- Provide context or environment
Example: “Today’s date is November 10, 2025. User location: New York.”
- Define behavioral rules
Example: “Always verify numbers from reliable sources before reporting. If data is unclear, ask a clarifying question.”
run = client.beta.maestro.runs.create_and_poll(
    system_prompt="You are a cautious financial journalist. Verify all data before reporting.",
    input="Write a brief update on today's top stock movements.",
    requirements=[
        {"name": "word_limit", "description": "No more than 120 words"},
        {"name": "tone", "description": "Neutral, professional tone"}
    ],
    budget="medium",
    tools=[
        {
            "type": "web_search",
            "urls": ["https://finance.yahoo.com", "https://www.reuters.com"]
        }
    ],
    include=["requirements_result"]
)

print(run.result)
print(run.requirements_result)
Working with Requirements
Writing Effective Requirements
Good Requirements:
requirements = [
    {
        "name": "word_count",
        "description": "Response must be between 150 and 200 words"
    },
    {
        "name": "json_format",
        "description": "Output must be valid JSON with 'title' and 'content' fields"
    },
    {
        "name": "no_technical_jargon",
        "description": "Avoid technical terms; explain concepts in plain English"
    }
]
Requirements to Avoid:
# Too vague
{"name": "good_quality", "description": "Make it good"}
# Contradictory
{"name": "short_and_detailed", "description": "Be brief but very detailed"}
# Unmeasurable
{"name": "creative", "description": "Be creative and original"}
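Some of these problems can be caught before a run is started. The function below is a rough, hypothetical pre-flight check, not part of the AI21 SDK: `check_requirements` and its `min_words` threshold are made-up names, and the word-count heuristic is only an assumption about what "too vague" might look like. It cannot detect contradictory or unmeasurable requirements.

```python
from typing import Dict, List


def check_requirements(requirements: List[Dict[str, str]], min_words: int = 4) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    seen_names = set()
    for index, req in enumerate(requirements):
        name = req.get("name", "").strip()
        description = req.get("description", "").strip()
        label = name or f"#{index}"
        if not name:
            problems.append(f"{label}: missing name")
        elif name in seen_names:
            problems.append(f"{label}: duplicate name")
        seen_names.add(name)
        if not description:
            problems.append(f"{label}: missing description")
        elif len(description.split()) < min_words:
            # Heuristic only: very short descriptions are often hard to validate.
            problems.append(f"{label}: description may be too vague")
    return problems


print(check_requirements([
    {"name": "good_quality", "description": "Make it good"},
    {"name": "length", "description": "Keep explanation under 100 words"},
]))
```

A requirement that passes this check can still be a poor one, so the check supplements the guidance above and does not replace it.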
Requirement Categories
Format Requirements:
{
    "name": "markdown_format",
    "description": "Use proper markdown with headers, bullet points, and code blocks"
}
Content Requirements:
{
    "name": "include_examples",
    "description": "Provide at least 2 concrete examples for each concept"
}
Style Requirements:
{
    "name": "professional_tone",
    "description": "Use formal business language, avoid contractions and slang"
}
Technical Requirements:
{
    "name": "python_best_practices",
    "description": "Follow PEP 8 style guidelines and use type hints"
}
Requirements Report
Enable detailed reporting by adding requirements_result to the include parameter:
run = client.beta.maestro.runs.create_and_poll(
    input="Write a product review for a smartphone",
    requirements=[
        {"name": "word_count", "description": "use 200-250 words"},
        {"name": "pros_and_cons", "description": "Include both pros and cons sections"},
        {"name": "rating", "description": "End with a 1-5 star rating. For example: (★★★★☆)"}
    ],
    include=["requirements_result"],
    budget="low"
)

print(f"Result: {run.result}")

# Analyze the results
# (single quotes inside the f-strings keep this valid on Python versions before 3.12)
print(f"Overall Score: {run.requirements_result['score']}")
print(f"Completion Reason: {run.requirements_result['finish_reason']}")
print("Requirements Results:")
for req in run.requirements_result["requirements"]:
    print(f"  {req['name']}: {req['score']}")
    print(f"    Issue: {req['reason']}")
Sample Output Analysis
# Example output
Overall Score: 0.67
Completion Reason: Budget exhausted
word_count: 1.0
pros_and_cons: 1.0
rating: 0.6
Issue: Rating format is '4 out of 5' instead of star format (★★★★☆)
This tells you:
- 2 out of 3 requirements were perfectly met
- The rating requirement needs refinement
- You might need a higher budget or a clearer requirement
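The same analysis can be done programmatically. The sketch below assumes the report is a dictionary shaped like the one indexed in the example above (`score`, `finish_reason`, and a `requirements` list with `name`, `score`, and `reason`). The helper name `unmet_requirements` and the sample data are hypothetical and only mirror the sample output.

```python
from typing import Any, Dict, List, Tuple


def unmet_requirements(requirements_result: Dict[str, Any]) -> List[Tuple[str, float, str]]:
    """Return (name, score, reason) for every requirement scored below 1.0, worst first."""
    unmet = [
        (req["name"], req["score"], req.get("reason", ""))
        for req in requirements_result["requirements"]
        if req["score"] < 1.0
    ]
    return sorted(unmet, key=lambda item: item[1])


# Hand-written sample that mirrors the sample output; not real API output.
sample_report = {
    "score": 0.67,
    "finish_reason": "Budget exhausted",
    "requirements": [
        {"name": "word_count", "score": 1.0, "reason": ""},
        {"name": "pros_and_cons", "score": 1.0, "reason": ""},
        {
            "name": "rating",
            "score": 0.6,
            "reason": "Rating format is '4 out of 5' instead of star format (★★★★☆)",
        },
    ],
}

for name, score, reason in unmet_requirements(sample_report):
    print(f"{name} ({score}): {reason}")
```

The returned list can drive a follow-up step, such as rewording only the requirements that did not reach 1.0.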
Use the budget parameter to control how much computational effort AI21 Maestro applies when executing your task. Higher budgets improve reasoning reliability but increase latency and cost.
The snippet below shows how to set different budget levels in your Maestro run. Replace task and requirements with your own input values, and make sure the client is initialized with your API key as shown in the Quickstart.
Budget Levels Explained
# High Budget - Maximum reliability (~100 seconds for complex tasks)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="high"
)

# Medium Budget - Balanced approach (~60 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="medium"
)

# Low Budget - Some reliability gain while favoring low latency (~20 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="low"
)
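One possible way to combine these levels is to start with the cheapest budget and escalate only when the requirements report shows an incomplete score. The sketch below is a hypothetical pattern, not an SDK feature: `run_with_escalation`, `run_once`, and `BUDGET_LADDER` are names introduced here, and the fake run object with made-up scores lets the example execute offline.

```python
from types import SimpleNamespace
from typing import Any, Callable, Tuple

BUDGET_LADDER = ("low", "medium", "high")


def run_with_escalation(
    run_once: Callable[[str], Any],
    target_score: float = 1.0,
) -> Tuple[str, Any]:
    """Try each budget in order and stop at the first run that reaches target_score.

    run_once is a hypothetical callable: it takes a budget string, performs one
    run, and returns an object exposing requirements_result["score"].
    """
    budget, run = BUDGET_LADDER[0], None
    for budget in BUDGET_LADDER:
        run = run_once(budget)
        if run.requirements_result["score"] >= target_score:
            break
    return budget, run


# Offline stand-in for a real run, with made-up scores per budget.
FAKE_SCORES = {"low": 0.67, "medium": 1.0, "high": 1.0}
calls = []


def fake_run_once(budget: str) -> SimpleNamespace:
    calls.append(budget)
    return SimpleNamespace(requirements_result={"score": FAKE_SCORES[budget]})


chosen_budget, final_run = run_with_escalation(fake_run_once)
print(chosen_budget, final_run.requirements_result["score"])
```

With a real client, `run_once` could wrap `client.beta.maestro.runs.create_and_poll(input=task, requirements=requirements, budget=budget, include=["requirements_result"])`. Each escalation step is a separate run, so this pattern trades extra latency and cost in hard cases for cheaper runs in easy ones.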
Using Third-Party Models
You can run Maestro tasks with both AI21 and third-party models.
Use the models parameter to specify which model to run your task with.
If no model is specified, Maestro will automatically select a suitable model based on the task requirements.
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    models=["gpt-4o"],  # Specify preferred model
    budget="high"
)
Setting the Response Language
You can control the output language of Maestro’s response using the response_language parameter. For example, to receive the result in Spanish:
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    models=["jamba-mini"],
    budget="medium",
    response_language="spanish"
)
print(run.result)