AI21 Maestro is an intelligent agentic system designed to handle complex AI workflows.
This guide focuses on Maestro's Validated Output capability, with practical examples ranging from basic usage to advanced scenarios.

Understanding the Problem

Traditional LLM interactions often look like this:
Python
# Traditional approach - unreliable
response = client.complete(
    prompt="""Write a Python function that:
    - Calculates fibonacci numbers
    - Is under 10 lines
    - Has proper docstrings
    - Uses descriptive variable names"""
)
# Sometimes works perfectly, sometimes doesn't follow all constraints
Common issues:
  • LLMs often fail to consistently meet all of the individual requirements outlined in the prompt
  • There is no visibility into which requirements were not met
  • Requires manual trial and error to achieve the desired output

How AI21 Maestro Works

Maestro’s instruction following enhancer uses a Generate → Validate → Fix cycle:
  1. Generate: Creates initial response following your requirements
  2. Validate: Evaluates and scores each requirement (0.0 to 1.0)
  3. Fix: Refines output for requirements that scored < 1.0
  4. Repeat: Continues until all requirements are met or budget is exhausted
This systematic approach to instruction following is part of Maestro’s broader agentic architecture, designed to handle complex workflows with reliability and precision.
text
Input + Requirements → Generate → Validate → Fix → Final Output + Report
                                      ↑       │
                                      └───────┘
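The cycle above can be sketched in plain Python. Note that `generate`, `validate`, and `fix` here are hypothetical stand-in functions illustrating the control flow, not AI21 SDK calls:

```python
def run_validated_output(task, requirements, generate, validate, fix, max_rounds=3):
    """Sketch of the Generate → Validate → Fix loop; max_rounds plays the role of a budget."""
    output = generate(task, requirements)
    scores = {r["name"]: validate(output, r) for r in requirements}
    for _ in range(max_rounds):
        failing = [r for r in requirements if scores[r["name"]] < 1.0]
        if not failing:
            break  # every requirement fully met
        output = fix(output, failing)  # refine only the unmet requirements
        scores = {r["name"]: validate(output, r) for r in requirements}
    return output, scores
```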

Using the API

The input parameter
You can pass a string to Maestro as input; it will be treated as a user message.
Python
from ai21 import AI21Client

client = AI21Client(api_key="your-api-key")

# create_and_poll blocks until the run completes or the default timeout is reached
run = client.beta.maestro.runs.create_and_poll(
    input="Explain quantum computing to a 10-year-old",
    requirements=[
        {
            "name": "reading_level",
            "description": "Use simple words appropriate for a 10-year-old"
        },
        {
            "name": "length",
            "description": "Keep explanation under 100 words"
        }
    ]
)
Alternatively, you can pass the input as an array of messages to support multiple turns in a conversation.
Python
input=[
    {
        "role": "user",
        "content": "Explain quantum computing to a 10-year-old",
    },
    {
        "role": "assistant",
        "content": 'Quantum computing is like a super-smart computer that uses tiny things called "qubits" instead of regular bits. While regular bits are like tiny switches that can be off (0) or on (1), qubits can be both at the same time! This helps quantum computers solve really hard problems much faster than normal computers by trying many possibilities at once.',
    },
    {
        "role": "user",
        "content": "Translate this to Spanish",
    },
],
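One way to assemble such a multi-turn input is to keep a running list of messages and append each turn; `append_turn` below is a hypothetical helper, not part of the SDK:

```python
def append_turn(messages, role, content):
    """Append one conversation turn; Maestro message roles are 'user' or 'assistant'."""
    assert role in {"user", "assistant"}
    messages.append({"role": role, "content": content})
    return messages

conversation = []
append_turn(conversation, "user", "Explain quantum computing to a 10-year-old")
append_turn(conversation, "assistant", "Quantum computing is like a super-smart computer...")
append_turn(conversation, "user", "Translate this to Spanish")
# `conversation` can now be passed as the `input` argument.
```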

System Prompt

The system_prompt defines the agent’s identity, operating principles, and boundaries before it processes any input.
It guides how Maestro interprets inputs, chooses tools, and reasons throughout the run.
Use it to:
  • Define the agent’s role and identity
    Example: “You are a cautious financial journalist. Verify all data before reporting”
  • Provide context or environment
    Example: Today’s date is November 10, 2025. User location: New York.
  • Define behavioral rules
    Example: “Always verify numbers from reliable sources before reporting. If data is unclear, ask a clarifying question.”

run = client.beta.maestro.runs.create_and_poll(
    system_prompt="You are a cautious financial journalist. Verify all data before reporting.",
    input="Write a brief update on today's top stock movements.",
    requirements=[
        {"name": "word_limit", "description": "No more than 120 words"},
        {"name": "tone", "description": "Neutral, professional tone"}
    ],
    budget="medium",
    tools=[
        {
            "type": "web_search",
            "urls": ["https://finance.yahoo.com", "https://www.reuters.com"]
        }
    ],
    include=["requirements_result"]
)

print(run.result)
print(run.requirements_result)

Working with Requirements

Writing Effective Requirements

Good Requirements:
Python
requirements = [
    {
        "name": "word_count",
        "description": "Response must be between 150 and 200 words"
    },
    {
        "name": "json_format",
        "description": "Output must be valid JSON with 'title' and 'content' fields"
    },
    {
        "name": "no_technical_jargon",
        "description": "Avoid technical terms; explain concepts in plain English"
    }
]
Requirements to Avoid:
Python
# Too vague
{"name": "good_quality", "description": "Make it good"}

# Contradictory
{"name": "short_and_detailed", "description": "Be brief but very detailed"}

# Unmeasurable
{"name": "creative", "description": "Be creative and original"}
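A lightweight pre-flight check can catch some of these problems before you spend budget. The heuristics below are purely illustrative, not an official validator:

```python
def lint_requirement(req):
    """Return a list of warnings for an obviously weak requirement (heuristic sketch)."""
    warnings = []
    desc = req.get("description", "")
    if not req.get("name"):
        warnings.append("missing name")
    if len(desc.split()) < 4:
        warnings.append("description too vague to validate")
    if not any(ch.isdigit() for ch in desc) and "must" not in desc.lower():
        # purely subjective wording is hard to score objectively
        warnings.append("no measurable criterion")
    return warnings
```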

Requirement Categories

Format Requirements:
Python
{
    "name": "markdown_format",
    "description": "Use proper markdown with headers, bullet points, and code blocks"
}
Content Requirements:
Python
{
    "name": "include_examples",
    "description": "Provide at least 2 concrete examples for each concept"
}
Style Requirements:
Python
{
    "name": "professional_tone",
    "description": "Use formal business language, avoid contractions and slang"
}
Technical Requirements:
Python
{
    "name": "python_best_practices",
    "description": "Follow PEP 8 style guidelines and use type hints"
}
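In practice you often mix several categories in one `requirements` list, since Maestro validates each entry independently:

```python
# All four categories combined into a single requirements list
requirements = [
    {"name": "markdown_format",
     "description": "Use proper markdown with headers, bullet points, and code blocks"},
    {"name": "include_examples",
     "description": "Provide at least 2 concrete examples for each concept"},
    {"name": "professional_tone",
     "description": "Use formal business language, avoid contractions and slang"},
    {"name": "python_best_practices",
     "description": "Follow PEP 8 style guidelines and use type hints"},
]
```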

Requirements Report

Enable detailed reporting by including requirements_result:
Python
run = client.beta.maestro.runs.create_and_poll(
    input="Write a product review for a smartphone",
    requirements=[
        {"name": "word_count", "description": "use 200-250 words"},
        {"name": "pros_and_cons", "description": "Include both pros and cons sections"},
        {"name": "rating", "description": "End with a 1-5 star rating. For example: (★★★★☆)"}
    ],
    include=["requirements_result"],
    budget="low"
)

print(f"Result: {run.result}")

# Analyze the results
print(f"Overall Score: {run.requirements_result['score']}")
print(f"Completion Reason: {run.requirements_result['finish_reason']}")

print("Requirements Results:")
for req in run.requirements_result["requirements"]:
    print(f"  {req['name']}: {req['score']}")
    print(f"    Issue: {req['reason']}")
Sample Output Analysis
Python
# Example output
Overall Score: 0.67
Completion Reason: Budget exhausted

word_count: 1.0
pros_and_cons: 1.0
rating: 0.6
  Issue: Rating format is '4 out of 5' instead of star format (★★★★☆)
This tells you:
  • 2 out of 3 requirements were perfectly met
  • The rating requirement needs refinement
  • You might need a higher budget or clearer requirement
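Assuming `requirements_result` has the dictionary shape shown above, the unmet requirements can also be collected programmatically, for example to decide whether to retry with a higher budget:

```python
def unmet_requirements(requirements_result, threshold=1.0):
    """Return (name, score, reason) for each requirement scoring below threshold."""
    return [
        (req["name"], req["score"], req.get("reason", ""))
        for req in requirements_result["requirements"]
        if req["score"] < threshold
    ]

# Mirroring the sample output above:
sample = {
    "score": 0.67,
    "finish_reason": "Budget exhausted",
    "requirements": [
        {"name": "word_count", "score": 1.0},
        {"name": "pros_and_cons", "score": 1.0},
        {"name": "rating", "score": 0.6,
         "reason": "Rating format is '4 out of 5' instead of star format"},
    ],
}
```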

Budget Control and Performance

Use the budget parameter to control how much computational effort AI21 Maestro applies when executing your task. Higher budgets improve reasoning reliability but increase latency and cost.
The snippet below shows how to set different budget levels in your Maestro run. Replace task and requirements with your own values, and make sure the client is initialized with your API key as shown in the Quickstart.
Budget Levels Explained
Python
# High Budget - Maximum reliability (~100 seconds for complex tasks)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="high"
)

# Medium Budget - Balanced approach (~60 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="medium"
)

# Low Budget - fastest option; favors latency over reliability (~20 seconds)
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    budget="low"
)
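A common pattern is to start cheap and escalate only when requirements remain unmet. The sketch below assumes a caller-supplied `run_fn(budget)` returning `(result, overall_score)`, since calling the real API requires a key:

```python
def run_with_escalation(run_fn, budgets=("low", "medium", "high")):
    """Try budgets in increasing order until the overall score reaches 1.0."""
    result, score = None, 0.0
    for budget in budgets:
        result, score = run_fn(budget)
        if score >= 1.0:
            break  # all requirements met; stop spending
    return result, score
```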

Using Third-Party Models

You can run Maestro tasks with both AI21 and third-party models.
Use the models parameter to specify which model to run your task with.
If no model is specified, Maestro will automatically select a suitable model based on the task requirements.
Python
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    models=["gpt-4o"],  # Specify preferred model
    budget="high"
)

Setting the Response Language

You can control the output language of Maestro’s response using the response_language parameter. For example, to receive the result in Spanish:
run = client.beta.maestro.runs.create_and_poll(
    input=task,
    requirements=requirements,
    models=["jamba-mini"],
    budget="medium",
    response_language="spanish"
)

print(run.output_text)