# Chat response

## Response details

### Non-streaming results

A successful non-streamed response includes the following members:

* `id`: Unique ID for each request (not for each message). All responses in a streaming response share the same ID.
* `choices`: One or more responses, depending on the `n` parameter from the request. Each response includes the following members:
  * `index`: Zero-based index of the message in the list of messages. Note that this might not correspond with the position in the response list.
  * `message`: The message generated by the model. Includes two fields, `role` and `content`, and may also include `tool_calls`.
    * `tool_calls`: Tool calls only occur if a `tools` parameter was specified in the request. These tool calls apply solely to the current message, and the returned values should be added to the message thread in both the assistant message's `tool_calls` field and the tool message.
      * `id`: ID of the tool call, generated by the model.
      * `type`: The type of tool called. Currently the only possible value is `"function"`.
      * `function`: The invoked function.
        * `name`: The name of the function, which you specified in your request.
        * `arguments`: A JSON object containing the function's parameters and values.
  * `finish_reason`: Why the message ended.
    * `stop`: The response ended naturally as a complete answer (due to an end-of-sequence token) or because the model generated a stop sequence provided in the request.
    * `length`: The response ended by reaching `max_tokens`.
* `usage`: The token counts for this request. Per-token billing is based on the prompt token and completion token counts and rates.
  * `prompt_tokens`: Number of tokens in the prompt for this request. The prompt token count covers the entire message history plus extra tokens used to combine the messages; the number of extra tokens is proportional to the number of messages.
  * `completion_tokens`: Number of tokens in the response message.
  * `total_tokens`: The sum of `prompt_tokens` and `completion_tokens`.

### Streamed results

Setting `stream = true` in the request will return a stream of messages, each containing one token.
You can read more about streaming calls using the [SDK](https://github.com/AI21Labs/ai21-python/blob/main/README.md#Streaming).

The final message will be `data: [DONE]`. All other messages will have `data` set to a JSON object with the following fields:

* `data`: Either an object with the following members, or the string `[DONE]` for the last message.
  * `id`: Unique ID for each request (not for each message). All streaming responses for a request share the same ID.
  * `choices`: An array with one object containing the following fields:
    * `index`: Always zero.
    * `delta`:
      * The first message in the stream will be an object set to `{"role":"assistant"}`.
      * Subsequent messages will have an object `{"content": <token>}` with the generated token.
    * `finish_reason`: Why the message ended.
  * `usage`: The total token counts for the request. Per-token billing is based on the prompt token and completion token counts and rates. When present, `usage` is `null` in every chunk except the last one, which contains the token usage statistics for the entire request.
    * `prompt_tokens`: Number of tokens in the prompt for this request. The prompt token count covers the entire message history plus extra tokens used to combine the messages; the number of extra tokens is proportional to the number of messages.
    * `completion_tokens`: Number of tokens in the response message.
    * `total_tokens`: The sum of `prompt_tokens` and `completion_tokens`.
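As a rough illustration of the stream format described above, the following sketch reassembles a reply from raw `data:` lines. It assumes each chunk is a JSON object with `choices`, `delta`, `finish_reason`, and `usage` fields; the helper name `assemble_stream` and the sample lines are hypothetical, written by hand for illustration rather than captured from the API.

```python
import json


def assemble_stream(lines):
    """Combine streamed `data:` lines into (text, finish_reason, usage)."""
    parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separator lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choice = chunk["choices"][0]
        delta = choice.get("delta", {})
        # The first chunk carries only the role, so content may be missing.
        parts.append(delta.get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
        # usage is null until the last chunk (assumed field layout).
        usage = chunk.get("usage") or usage
    return "".join(parts), finish_reason, usage


# Hand-written sample lines (assumption: not real API output).
sample = [
    'data: {"id": "abc", "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": null}], "usage": null}',
    'data: {"id": "abc", "choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}], "usage": null}',
    'data: {"id": "abc", "choices": [{"index": 0, "delta": {"content": " world"}, "finish_reason": "stop"}], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "data: [DONE]",
]

text, finish_reason, usage = assemble_stream(sample)
print(text)
```

When using the SDK, this parsing is handled for you and each chunk is exposed as an object, as in the examples that follow.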
```python Python (Non-streaming results) theme={"system"}
import asyncio

from ai21 import AsyncAI21Client
from ai21.models.chat import ChatMessage

messages = [ChatMessage(content="What is the meaning of life?", role="user")]

client = AsyncAI21Client()


async def main():
    # Without stream=True, the call returns one complete response object.
    response = await client.chat.completions.create(
        messages=messages,
        model="jamba-large",
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```

```python Python (Streamed results) theme={"system"}
from ai21 import AI21Client
from ai21.models.chat import ChatMessage

messages = [ChatMessage(content="What is the meaning of life?", role="user")]

client = AI21Client()

response = client.chat.completions.create(
    messages=messages,
    model="jamba-large",
    stream=True,
)

for chunk in response:
    # The first chunk carries only the role, so content may be empty.
    print(chunk.choices[0].delta.content or "", end="")
```

***

## Error Codes

500 - Internal Server Error\
429 - Too Many Requests (You are sending requests too quickly.)\
503 - Service Unavailable (The engine is currently overloaded, please try again later)\
401 - Unauthorized (Incorrect API key provided/Invalid Authentication)\
403 - Access Denied\
422 - Unprocessable Entity (Request body is malformed)

***
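One possible way to act on these status codes is sketched below. Treating 429 and 503 as retryable is an assumption drawn from their descriptions ("too quickly", "try again later"), not a documented policy; the names `RETRYABLE` and `backoff_delays` are hypothetical helpers, not part of the SDK.

```python
# Status codes listed in this section, with their short names.
ERROR_CODES = {
    401: "Unauthorized",
    403: "Access Denied",
    422: "Unprocessable Entity",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumption: only rate-limit and overload errors are worth retrying.
RETRYABLE = {429, 503}


def backoff_delays(status, max_retries=3, base=1.0):
    """Return the wait (in seconds) before each retry, or [] if not retryable."""
    if status not in RETRYABLE:
        return []
    # Exponential backoff: base, 2*base, 4*base, ...
    return [base * 2 ** attempt for attempt in range(max_retries)]


for code in (429, 401):
    print(code, ERROR_CODES[code], backoff_delays(code))
```

Errors such as 401 or 422 usually point to a problem with the request itself, so retrying them unchanged is unlikely to help.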