Migrating from Jurassic to Jamba
In April 2024, AI21 Labs released Jamba, a powerful new foundation model built on the Mamba architecture. In Q3 2024, AI21 will deprecate the older Jurassic foundation models in favor of the much more capable Jamba models. The older models may remain available on some partner sites, but we recommend upgrading from Jurassic to Jamba.
Summary of changes
Chat and completion endpoints in Jurassic are merged into a single, chat-style endpoint in Jamba. This single endpoint handles chat, question answering, and completion.
Completion
In general, the new endpoint is designed for chat-style interaction rather than prompt completion by default. To trigger completion behavior in Jamba, phrase the request as an explicit instruction:
- Jurassic completion model prompt: One fish, two fish…
- Equivalent Jamba prompt: Please complete this sentence: One fish, two fish
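Here is a minimal sketch of that prompt as a request body for the single chat endpoint. Field names follow the new request object shown later on this page; the model name and parameter values are illustrative.

```python
# Completion-style request body for the single chat endpoint (values are illustrative).
completion_request = {
    "model": "jamba-instruct",
    "messages": [
        {"role": "user", "content": "Please complete this sentence: One fish, two fish"}
    ],
    "max_tokens": 50,
    "temperature": 0.7,
}
```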
Question answering
For question answering behavior, simply ask the question in a single-turn request:
Jamba prompt: Who was the first emperor of Rome?
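As a sketch, the messages array for such a single-turn request contains just the one user message:

```python
# Single-turn question answering: one user message, no prior history.
qa_messages = [
    {"role": "user", "content": "Who was the first emperor of Rome?"}
]
```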
Multi-turn chat
Multi-turn chat remains largely the same. For example, here is a short message thread with two user inputs.
- [system message] You are a bank teller with a friendly and courteous but formal manner. Use please and thank you when appropriate.
- [user message] How much money is in my account?
- [assistant message] What is your account number please?
- [user message] 1234565765
- [assistant message] Thank you. As of today, your balance is $12.04
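In the new request format, that thread is sent as a messages array. The sketch below reproduces the conversation; the assistant's final reply is what the call returns, so only the turns up to the last user message are sent.

```python
# The bank-teller thread above, expressed as the messages array of a Jamba request.
# Earlier turns are resent with each request; the model replies to the last user message.
messages = [
    {"role": "system", "content": "You are a bank teller with a friendly and courteous but formal "
                                  "manner. Use please and thank you when appropriate."},
    {"role": "user", "content": "How much money is in my account?"},
    {"role": "assistant", "content": "What is your account number please?"},
    {"role": "user", "content": "1234565765"},
]
```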
Read the Jamba-instruct documentation for all usage details.
Detailed changes
Here is a more complete list of changes.
- At present, Jamba has a single foundation model, which replaces all variations of the Jurassic 2 models.
- Chat, completion, and question answering are all handled by a single endpoint. The new endpoint is a chat style endpoint with a message history.
- Chat behavior remains conceptually the same, although there are structural changes in the request and response objects.
- REST path changes (see the example request after this list):
  - All the old REST paths start with /studio/v1/j2-..., for example: /studio/v1/j2-light/complete
  - New single REST path: /studio/v1/chat/completions
- SDK usage should remain the same, but specify jamba-instruct as the model name.
- For completion or instruction use, the new model generally expects instructions rather than examples, although you can add examples to your instructions if instructions alone give unsatisfying results.
- The new response does not include individual tokens or per-token data; only total token counts are returned.
- The new endpoint supports streaming responses, which return the generated output one token at a time.
- A number of request parameters were dropped, renamed, changed, or added. See the table below.
- The response object format has changed. See below for the changes.
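To make the endpoint change concrete, here is a rough sketch of a call to the new path using Python's requests library. The base URL and Bearer-token header are assumptions for illustration; the path, model name, and body fields come from this page.

```python
import requests

API_KEY = "YOUR_API_KEY"  # assumption: the API key is sent as a Bearer token

# The new single REST path replaces per-model Jurassic paths such as /studio/v1/j2-light/complete.
url = "https://api.ai21.com/studio/v1/chat/completions"  # base URL is an assumption

body = {
    "model": "jamba-instruct",
    "messages": [
        {"role": "user", "content": "Please complete this sentence: One fish, two fish"}
    ],
    "max_tokens": 64,
    "temperature": 0.7,
}

response = requests.post(url, json=body, headers={"Authorization": f"Bearer {API_KEY}"})
print(response.json())
```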
Request object
Parameter changes
Jamba Instruct | Jurassic chat | Jurassic completion |
---|---|---|
messages [object] (see new object format below) | messages [object] | prompt [string] |
n | numResults | numResults |
max_tokens | maxTokens | maxTokens |
-- | minTokens | minTokens |
-- | topKReturn | topKReturn |
top_p | topP | topP |
-- | -- | minP |
stop [string or array] | stopSequences | stopSequences |
stream | -- | -- |
-- | frequencyPenalty | frequencyPenalty |
-- | presencePenalty | presencePenalty |
-- | countPenalty | countPenalty |
-- | -- | logitBias |
-- | -- | epoch |
Request object changes
Old object | New object |
---|---|
{ "prompt": string, "numResults": int, "maxTokens": int, "minTokens": int, "temperature": float, "topP": float, "minP": float, "topKReturn": float, "logitBias": {}, "frequencyPenalty": { … }, "presencePenalty": { … }, "countPenalty": { … }, "epoch": "int" } |
{ "model": "jamba-instruct-preview", "messages": [ {"role":"user|system|assistant", "content":string} ], "temperature": float, "max_tokens": int, "top_p": float, "stream": bool, "n":1 } |
Response object changes (completion)
Old object | New object |
---|---|
{ "id":"string ID", "prompt":{ "text":str, "tokens":[ { "generatedToken":{ "token":"token value", "logprob":-float "raw_logprob":-float }, "topTokens":null, "textRange":{ "start":int, "end":int } } ] }, "completions":[ { "data":{ "text":"Suggested completion", "tokens":[ { "generatedToken":{ "token":"Token 1...", "logprob":-float "raw_logprob":-float }, "topTokens":null, "textRange":{ "start":0, "end":1 } } ] }, "finishReason":{ "reason":"endoftext" } } ] } |
{ "id":"string ID", "choices":[ { "index":int, "message":{ "role":user|system|assistant, "content":string } } ], "usage":{ "prompt_tokens":int, "completion_tokens"int, "total_tokens":int } } |