POST /studio/v1/chat/completions
from ai21 import AI21Client
from ai21.models.chat import ChatMessage

messages = [
    ChatMessage(role="user", content="Hello, how are you?"),
]

# Reads the AI21_API_KEY environment variable by default
client = AI21Client()

response = client.chat.completions.create(
    messages=messages,
    model="jamba-large",
    max_tokens=1024,
)
print(response.choices[0].message.content)

Overview

The Jamba API provides access to a set of instruction-following chat models. This page describes how to interact with the chat models via the API endpoint and specifies the request and response structures.


Request body

model
string
required

The name of the model to use.
You can call our model without specifying a version by using the following model names:

  • jamba-large
  • jamba-mini

For more information on the available model versions, see the model versions documentation.

messages
object[]
required

A list of messages representing the conversation history. The structure of each message object depends on its role.
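For example, a multi-turn conversation can be sketched as the following request-body array (an optional leading system message sets the model's behavior; the exact set of supported roles should be checked against the current reference):

```python
# Sketch of the messages array as it appears in the request body.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is Jamba?"},
    {"role": "assistant", "content": "Jamba is a family of chat models."},
    {"role": "user", "content": "Which sizes are available?"},
]
```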

tools
object[]

A list of tools that the model can use when generating a response.
Currently, only function-type tools are supported.
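A minimal function-type tool definition might look as follows. The field layout shown here (a `function` object with a JSON Schema `parameters` block) follows the common function-calling convention; the function name is hypothetical, and the exact schema should be verified against the current API reference.

```python
# One function-type tool with JSON Schema parameters.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical function name
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
        },
    }
]
```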

documents
object[]

The documents parameter accepts a list of objects, each containing multiple fields.
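As a sketch, each entry carries the document text plus optional descriptive fields; `content` holds the text, while the metadata shown here is illustrative and should be checked against the current API reference.

```python
# Sketch of a documents entry: text content plus hypothetical metadata.
documents = [
    {
        "content": "Example document text to ground the model's answer.",
        "metadata": {"source": "product-notes"},  # illustrative field
    }
]
```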

response_format
object

An object defining the output format required from the model.
Setting it to { "type": "json_object" } activates JSON mode, ensuring the generated message adheres to valid JSON structure.
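For instance, JSON mode can be requested by including the setting described above in the request body (the model name here is one of the names listed earlier; the prompt is illustrative):

```python
# Request body with JSON mode enabled via response_format.
request_body = {
    "model": "jamba-mini",
    "messages": [
        {"role": "user", "content": "List three colors as a JSON array."},
    ],
    "response_format": {"type": "json_object"},
}
```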

max_tokens
integer

The maximum number of tokens the model can generate in its response.
For Jamba models, the maximum allowed value is 4096 tokens.

temperature
float

Controls the variety of responses: a higher value results in more diverse answers.
Default: 0.4, Range: 0.0 – 2.0

top_p
float

Limits the pool of next tokens at each step to the top percentile of probability mass: 1.0 means the pool of all possible tokens, while 0.01 restricts it to only the most likely next tokens.
Default: 1.0, Range: 0.0 – 1.0

stop
string[]

End the message when the model generates one of these strings. The stop sequence is not included in the generated message. Each sequence can be up to 64K characters long and can contain newlines as \n characters.
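As a small illustration, a stop list may mix a plain marker with a sequence containing a newline (both values here are arbitrary examples):

```python
# Generation halts when either sequence is produced; the sequence
# itself is excluded from the returned message.
stop = ["<END>", "\nUser:"]
```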

n
integer

How many chat responses to generate. Default: 1, Range: 1 – 16.
Notes:

  • If n > 1, setting temperature = 0 will fail because all answers are guaranteed to be duplicates.
  • n must be 1 when stream = True

stream
boolean

Stream results one token at a time using server-sent events. Useful for long results to avoid long wait times. If True, n must be 1. Must be False if using tools.
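Consuming a streamed response with the Python SDK can be sketched as below; this is a sketch assuming the SDK's streaming chunks expose incremental text at `choices[0].delta.content`, which should be verified against the current SDK documentation.

```python
from ai21 import AI21Client
from ai21.models.chat import ChatMessage


def stream_reply(prompt: str) -> str:
    """Stream a chat completion and return the assembled text."""
    client = AI21Client()  # reads AI21_API_KEY from the environment
    parts = []
    # stream=True delivers the answer as server-sent events; n must be 1.
    for chunk in client.chat.completions.create(
        messages=[ChatMessage(role="user", content=prompt)],
        model="jamba-mini",
        stream=True,
    ):
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
            print(text, end="")
    return "".join(parts)
```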
