Introducing AI21 Maestro

AI21 Maestro is an AI system for rapidly creating and deploying RAG agents that automate high-value, data-intensive business tasks. At the core of AI21 Maestro is a new type of agent intelligence, optimized to find the smartest way to search, reason, validate, and adapt in real time to accomplish tasks, while staying within your cost and latency requirements.

Key Benefits

  • Reliable Results with Built-In Validation
    AI21 Maestro delivers accurate, high-quality outputs by selecting optimal tools, scaling compute resources as needed, and rigorously validating each step, all within your latency and cost constraints.
  • Scalable and Fast to Deploy
    AI21 Maestro reduces time-to-value by automatically creating tailored execution plans. Simply define your goals, connect tools, and set your budget; AI21 Maestro handles the rest.
  • Full Transparency and Traceability
    Every result includes an execution trace and a structured validation report, showing exactly how the system performed against your stated requirements.

Product Details

AI21 Maestro is a dynamic planning system that determines, at inference time, the optimal sequence of actions to solve a given task. The system validates and corrects its own work, continuously evaluating outputs against your specified requirements.

Essentially, each call to AI21 Maestro builds a tree of calls to LLMs and other tools.
Based on the task requirements and available budget, AI21 Maestro strategically plans which techniques to employ.
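
As a rough mental model of the "tree of calls" described above, the following Python sketch represents a run as nested nodes. The `CallNode` class, its fields, and the step names are all hypothetical and chosen for illustration; the actual internal representation used by AI21 Maestro is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """One step in a hypothetical execution tree (an LLM call or a tool call)."""
    kind: str   # illustrative label, e.g. "llm" or "tool"
    name: str   # illustrative step name
    children: list["CallNode"] = field(default_factory=list)

    def count(self) -> int:
        """Total number of calls in this subtree, including this node."""
        return 1 + sum(child.count() for child in self.children)

    def depth(self) -> int:
        """Number of levels in this subtree."""
        return 1 + max((child.depth() for child in self.children), default=0)


# A made-up run: a planning call that fans out into a search and a
# draft step, where the draft is followed by a validation call.
tree = CallNode("llm", "plan", [
    CallNode("tool", "search"),
    CallNode("llm", "draft", [CallNode("llm", "validate")]),
])
```

In this picture, a larger budget would correspond to a tree with more nodes or more levels, which is one way to think about why cost and latency grow with it.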

Model Selection

AI21 Maestro is model-agnostic: it can orchestrate tasks using AI21’s first-party models or third-party models hosted by other providers.

You can specify which model to use for a given run, or let Maestro automatically select the most suitable one based on your requirements.

This flexibility lets you balance performance, latency, and cost while maintaining a unified interface.

Available model types:
  • First-Party Models (AI21-hosted)
    Optimized for reasoning and retrieval tasks, managed directly by AI21.
    Examples: jamba-large, jamba-mini.
  • Third-Party Models (Managed by AI21)
    Access popular external models (e.g., from OpenAI, Anthropic, Google) directly through the AI21 API, with no extra setup required.
    Examples: gpt-4.1, claude-4-sonnet, gemini-2.5-flash, mistral-7b.
  • Third-Party Models (BYOK)
    Use your own API keys to access external models securely through Maestro.
    Supported providers include OpenAI, Anthropic, and Google.
    Configure BYOK models on the Third-Party Models page, then reference their IDs in your requests.
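
The choice between pinning a model and letting Maestro pick one can be sketched as an optional field in a request body. This is a minimal illustration only: the `build_run_request` helper and the `input` and `model` field names are assumptions made for this example, not the documented request schema.

```python
from typing import Optional


def build_run_request(task: str, model: Optional[str] = None) -> dict:
    """Build a hypothetical request body; the field names are assumptions."""
    request = {"input": task}
    if model is not None:
        # Pin the run to one model ID (first-party, AI21-managed, or BYOK).
        request["model"] = model
    # When "model" is omitted, this sketch assumes Maestro selects a
    # suitable model automatically, as described above.
    return request


pinned = build_run_request("Summarize the attached report.", model="jamba-mini")
automatic = build_run_request("Summarize the attached report.")
```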

Budget Control

The budget parameter lets you control the balance between speed, cost, and reliability. A higher budget allows Maestro to explore more reasoning paths, take multiple execution steps, and validate results more thoroughly, which can improve accuracy but also increases latency and cost. A lower budget returns results faster and with lower compute usage, making it ideal for simpler or time-sensitive tasks.

Budget levels:
  • low: Fastest and most cost-efficient. One execution step is taken with minimal effort.
  • medium: Balanced performance for typical use cases. Multiple execution steps with moderate effort.
  • high: Maximum reliability for complex or high-stakes tasks. Multiple strategies and validation cycles are applied.
Defaults to low if not specified.
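
A client can check the budget value before sending a request. The sketch below mirrors the three levels and the default of low described above; the `resolve_budget` helper is hypothetical and is not part of any AI21 SDK.

```python
from typing import Optional

# The three levels listed above, ordered from cheapest to most thorough.
BUDGET_LEVELS = ("low", "medium", "high")
DEFAULT_BUDGET = "low"


def resolve_budget(budget: Optional[str] = None) -> str:
    """Return a valid budget level, falling back to the default when unset."""
    if budget is None:
        return DEFAULT_BUDGET
    if budget not in BUDGET_LEVELS:
        raise ValueError(
            f"budget must be one of {BUDGET_LEVELS}, got {budget!r}"
        )
    return budget
```

Validating locally like this surfaces a typo such as "hgih" before any compute is spent on the run.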

Response Language

AI21 Maestro can return outputs in multiple languages.
You can control the response language for each run using the response_language parameter.
Supported languages include Arabic, Dutch, English, French, German, Hebrew, Italian, Portuguese, and Spanish.
If not specified, Maestro defaults to English.
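
The same kind of client-side check works for response_language. The sketch below assumes the parameter accepts language names as written in the list above; the exact accepted values (names versus language codes) are not stated here, and the `resolve_response_language` helper is hypothetical.

```python
from typing import Optional

# Languages listed above as supported.
SUPPORTED_LANGUAGES = frozenset({
    "Arabic", "Dutch", "English", "French", "German",
    "Hebrew", "Italian", "Portuguese", "Spanish",
})
DEFAULT_LANGUAGE = "English"


def resolve_response_language(language: Optional[str] = None) -> str:
    """Return a supported language name, defaulting to English when unset."""
    if language is None:
        return DEFAULT_LANGUAGE
    # Normalize spacing and letter case, so "  french " becomes "French".
    normalized = language.strip().capitalize()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported response language: {language!r}")
    return normalized
```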

Saving and Reusing Agents

You can save agents in AI21 Maestro and reuse them in future API calls. A saved agent stores its name, instructions, tools, and configuration, so you don’t need to redefine them each time you invoke it.
This keeps your workflows consistent and efficient.
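
To illustrate what a saved agent bundles together, here is a minimal local sketch. The `SavedAgent` class, the in-memory registry, the `save_agent` and `load_agent` helpers, and the example values are all hypothetical; in practice Maestro stores agents on the service side and the real API may look different.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SavedAgent:
    """The four pieces a saved agent stores, per the description above."""
    name: str
    instructions: str
    tools: tuple = ()                                   # placeholder tool names
    configuration: dict = field(default_factory=dict)   # e.g. budget, language


# Stand-in for server-side storage, keyed by agent name.
_registry: dict = {}


def save_agent(agent: SavedAgent) -> None:
    """Store the agent so later calls can refer to it by name."""
    _registry[agent.name] = agent


def load_agent(name: str) -> SavedAgent:
    """Fetch a previously saved agent; raises KeyError if it is unknown."""
    return _registry[name]


save_agent(SavedAgent(
    name="contract-reviewer",
    instructions="Extract key terms from each contract.",
    tools=("example_tool",),
    configuration={"budget": "high", "response_language": "English"},
))
agent = load_agent("contract-reviewer")
```

Later invocations then only need the agent's name, while the instructions, tools, and configuration stay identical from run to run.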

Supported Use Cases

Deep research agents for high-stakes tasks:
  • Financial report generation
  • RFP response generation
  • High-CapEx equipment troubleshooting
  • M&A due diligence
  • Organization compliance review
  • Contract portfolio analysis
Complex document analysis:
  • Financial document summarization
  • Investment prospectus analysis
  • Clinical trial results analysis
  • Technical documentation comparison
  • Loan application evaluation
  • Insurance claim analysis
High-accuracy information parsing and extraction:
  • Legacy systems data migration
  • Customer interactions intelligence
  • Medical history encoding
  • Supply chain data standardization
  • Clinical trial results analysis
  • Patent claim element extraction
  • Contract term extraction