Tool calls only occur if a tools parameter was specified in the request. They apply solely to the current message, and the returned values should be added to the message thread both in the assistant message's tool_calls field and in the corresponding tool messages.
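A minimal sketch of threading tool calls back into the conversation. The endpoint URL, model name, response layout (`choices[0].message`), and the `get_current_weather` helper are illustrative assumptions, not part of the documented API.

```python
import json
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def get_current_weather(city: str) -> str:
    # Hypothetical local implementation of the tool.
    return json.dumps({"city": city, "temperature_c": 21})

messages = [{"role": "user", "content": "What is the weather in Paris?"}]
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = requests.post(API_URL, headers=HEADERS,
                     json={"model": "my-model", "messages": messages, "tools": tools})
assistant_msg = resp.json()["choices"][0]["message"]

if assistant_msg.get("tool_calls"):
    # 1. Append the assistant message, tool_calls included, to the thread.
    messages.append(assistant_msg)
    # 2. Run each requested tool and append its result as a tool message.
    for call in assistant_msg["tool_calls"]:
        args = json.loads(call["function"]["arguments"])
        result = get_current_weather(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "name": call["function"]["name"],
            "content": result,
        })
    # 3. Send the extended thread back for the final answer.
    resp = requests.post(API_URL, headers=HEADERS,
                         json={"model": "my-model", "messages": messages, "tools": tools})
```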
The response ended naturally as a complete answer (the model emitted its end-of-sequence token) or because the model generated one of the stop sequences provided in the request.
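A minimal sketch of checking how generation ended; the `choices` / `finish_reason` layout is an assumption based on common chat-completion response schemas, and the sample response is hard-coded for illustration.

```python
# Hypothetical, pre-parsed response body.
response = {
    "choices": [{
        "message": {"role": "assistant", "content": "42."},
        "finish_reason": "stop",
    }]
}

if response["choices"][0]["finish_reason"] == "stop":
    # Ended naturally (end-of-sequence) or on a stop sequence from the request.
    print(response["choices"][0]["message"]["content"])
```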
Number of tokens in the prompt for this request.
The prompt token count covers the entire message history plus extra tokens used to combine messages, so it grows in proportion to the number of messages.
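A minimal sketch of reading token counts from a non-streaming response; the `usage` field layout and the numbers shown are assumptions based on common chat-completion schemas.

```python
# Hypothetical, pre-parsed response body.
response = {
    "usage": {"prompt_tokens": 57, "completion_tokens": 128, "total_tokens": 185}
}

usage = response["usage"]
print("prompt tokens:", usage["prompt_tokens"])          # grows with the message history
print("completion tokens:", usage["completion_tokens"])  # tokens generated for this reply
```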
Setting stream = true in the request will return a stream of messages, each containing one token. You can read more about streaming calls using the SDK.
The final message will be data: [DONE]. All other messages will have data set to a JSON object with the following fields:
The last message includes this field, which shows the total token counts for the request. Per-token billing is based on the prompt and completion token counts and their respective rates.
When present, this field is null in every chunk except the last, which contains the token usage statistics for the entire request.
Number of tokens in the prompt for this request.
The prompt token count covers the entire message history plus extra tokens used to combine messages, so it grows in proportion to the number of messages.
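A minimal sketch of consuming the stream directly over HTTP with `requests`; the endpoint URL, model name, and chunk layout (`choices[0].delta.content`, `usage`) are assumptions based on common chat-completion streaming schemas.

```python
import json
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Write a haiku about rain."}],
    "stream": True,
}

with requests.post(API_URL, headers=HEADERS, json=payload, stream=True) as resp:
    for raw_line in resp.iter_lines():
        if not raw_line:
            continue
        line = raw_line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":          # final message: end of the stream
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)
        # usage is null on every chunk except the last one, which carries
        # the token counts for the whole request.
        if chunk.get("usage"):
            print("\nprompt tokens:", chunk["usage"]["prompt_tokens"])
```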
500 - Internal Server Error
429 - Too Many Requests (You are sending requests too quickly.)
503 - Service Unavailable (The engine is currently overloaded; please try again later.)
401 - Unauthorized (Incorrect API key provided or invalid authentication)
403 - Access Denied
422 - Unprocessable Entity (Request body is malformed)
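A minimal sketch of mapping the status codes above to handling strategies; the endpoint URL, model name, and the retry/backoff policy are illustrative assumptions, not documented behavior.

```python
import time
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
RETRYABLE = {429, 500, 503}  # too many requests, server error, overloaded engine

def post_with_retries(payload: dict, max_attempts: int = 3) -> dict:
    for attempt in range(1, max_attempts + 1):
        resp = requests.post(API_URL, headers=HEADERS, json=payload)
        if resp.status_code == 200:
            return resp.json()
        if resp.status_code in (401, 403):
            # Not retryable: fix the API key or permissions first.
            raise RuntimeError("Unauthorized or access denied; check your API key.")
        if resp.status_code == 422:
            # Not retryable: the request body itself is malformed.
            raise ValueError("Unprocessable entity: " + resp.text)
        if resp.status_code in RETRYABLE and attempt < max_attempts:
            time.sleep(2 ** attempt)  # simple exponential backoff before retrying
            continue
        resp.raise_for_status()

result = post_with_retries({
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello"}],
})
```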