Python SDK - with Amazon Bedrock

AI21 Studio Python SDK with Amazon Bedrock Guide

This guide covers how to use the AI21 Studio Python SDK with its Bedrock integration to interact with Jurassic-2 models.

Set up

To get started, install AI21's SDK with the Bedrock integration by running the following command:

$ pip install -U "ai21[AWS]"

If you run into issues with the Bedrock client and your installed version is lower than 1.3.1, upgrade:

$ pip install -U "ai21[AWS]>=1.3.1"
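To decide whether the upgrade applies to you, you can compare the installed version against 1.3.1. The following is a minimal sketch using only the standard library. The helper names (parse_version, needs_upgrade, installed_ai21_version) are hypothetical and not part of the SDK, and the comparison is deliberately simplified: it looks only at the first three numeric components and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (1, 3, 1)  # threshold taken from the upgrade note above


def parse_version(text):
    """Turn a string such as '1.3.0' into a tuple like (1, 3, 0).

    Simplified on purpose: only the first three numeric components are kept.
    """
    parts = [int(n) for n in re.findall(r"\d+", text)[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def needs_upgrade(installed, minimum=MIN_VERSION):
    """Return True when the installed version is lower than the minimum."""
    return parse_version(installed) < minimum


def installed_ai21_version():
    """Return the installed ai21 version string, or None if it is not installed."""
    try:
        return version("ai21")
    except PackageNotFoundError:
        return None


current = installed_ai21_version()
if current is None:
    print("ai21 is not installed")
elif needs_upgrade(current):
    print(f"ai21 {current} is older than 1.3.1; consider upgrading")
else:
    print(f"ai21 {current} is recent enough")
```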

Using AI21 Studio Python SDK with Bedrock

To use the AI21 Studio Python SDK with Bedrock, you first need to get access to the models through Bedrock. After that, it works like any other Python SDK.

Example: Using AI21 Studio Python SDK with Bedrock for Jurassic-2

Below is a sample usage of the AI21 Python SDK with Bedrock integration to interact with Jurassic-2 models:

from ai21 import AI21BedrockClient, BedrockModelID

# Create a client that uses the default boto3 session
client = AI21BedrockClient()

# Request a completion from J2 Mid
response = client.completion.create(
    model_id=BedrockModelID.J2_MID_V1,
    prompt="explain black holes to 8th graders",
    num_results=1,
    max_tokens=100,
    temperature=0.7,
)

print(response)

By customizing the request parameters, you can control the content and style of the generated text. For a full list of available options, check out our Complete API page.

  • model_id: string - required
  • boto_session: optional - a custom boto3 session; if omitted, a default session is used

All Jurassic-2 models can be used with the same client.completion.create function by passing one of the following values as model_id:

BedrockModelID.J2_MID_V1
BedrockModelID.J2_ULTRA_V1

Passing your own boto3 session is optional; if you don't pass one, a default session is used.
The advantages of using your own boto3 session are:

  • Custom Configuration - You can customize configuration options such as the AWS region, credentials, and other settings.
  • Credentials Management - You can supply credentials in different ways, such as environment variables, AWS configuration files, or IAM roles.

Here's an example of how to set the AWS region through a custom boto3 session:

import boto3
from ai21 import AI21BedrockClient, BedrockModelID

boto_session = boto3.Session(region_name="us-east-1")

# Create a client that uses the custom boto3 session
client = AI21BedrockClient(session=boto_session)

# Request a completion from J2 Mid
response = client.completion.create(
    model_id=BedrockModelID.J2_MID_V1,
    prompt="explain black holes to 8th graders",
    num_results=1,
    max_tokens=100,
    temperature=0.7,
)

Response

Here's an example of a response object returned by a J2 Mid model:

{
   "id":"94078cb6-687e-4262-ef8f-1d7c2b0dbd2b",
   "prompt":{
      "text":"These are a few of my favorite",
      "tokens":[
         {
            "generatedToken":{
               "token":"▁These▁are",
               "logprob":-8.824776649475098,
               "raw_logprob":-8.824776649475098
            },
            "topTokens":"None",
            "textRange":{
               "start":0,
               "end":9
            }
         },
         {
            "generatedToken":{
               "token":"▁a▁few",
               "logprob":-4.798709869384766,
               "raw_logprob":-4.798709869384766
            },
            "topTokens":"None",
            "textRange":{
               "start":9,
               "end":15
            }
         },
         {
            "generatedToken":{
               "token":"▁of▁my▁favorite",
               "logprob":-1.0864331722259521,
               "raw_logprob":-1.0864331722259521
            },
            "topTokens":"None",
            "textRange":{
               "start":15,
               "end":30
            }
         }
      ]
   },
   "completions":[
      {
         "data":{
            "text":" things –",
            "tokens":[
               {
                  "generatedToken":{
                     "token":"▁things",
                     "logprob":-0.0003219324571546167,
                     "raw_logprob":-0.47372230887413025
                  },
                  "topTokens":"None",
                  "textRange":{
                     "start":0,
                     "end":7
                  }
               },
               {
                  "generatedToken":{
                     "token":"▁–",
                     "logprob":-7.797079563140869,
                     "raw_logprob":-4.319167613983154
                  },
                  "topTokens":"None",
                  "textRange":{
                     "start":7,
                     "end":9
                  }
               }
            ]
         },
         "finishReason":{
            "reason":"length",
            "length":2
         }
      }
   ]
}

The response is a nested data structure containing information about the processed request, prompt, and completions. At the top level, the response has the following fields:

id

A unique string id for the processed request. Repeated identical requests receive different IDs.

prompt

The prompt includes the raw text, the tokens with their log probabilities, and the top-K alternative tokens at each position, if requested. It has two nested fields:

  • text (string)
  • tokens (list of TokenData)

completions

A list of completions, including raw text, tokens, and log probabilities. The number of completions corresponds to the requested num_results. Each completion has two nested fields:

  • data, which contains the text (string) and tokens (list of TokenData) for the completion.
  • finishReason, a nested data structure describing the reason generation was terminated for this completion.
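As an illustration of the completions structure, the sketch below pulls the generated text and the finish reason out of each completion. It assumes the response is available as a plain Python dict shaped like the sample response; the object actually returned by the SDK may need converting first, which is not covered here. The helper name summarize_completions is hypothetical.

```python
# Trimmed copy of the sample response (token lists omitted for brevity).
response = {
    "id": "94078cb6-687e-4262-ef8f-1d7c2b0dbd2b",
    "prompt": {"text": "These are a few of my favorite", "tokens": []},
    "completions": [
        {
            "data": {"text": " things –", "tokens": []},
            "finishReason": {"reason": "length", "length": 2},
        }
    ],
}


def summarize_completions(response):
    """Return one (text, finish reason) pair per completion."""
    return [
        (completion["data"]["text"], completion["finishReason"]["reason"])
        for completion in response["completions"]
    ]


for text, reason in summarize_completions(response):
    print(f"{text!r} (finish reason: {reason})")
```

With num_results greater than 1, the same loop would yield one pair per completion.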

TokenData

The TokenData object provides detailed information about each token in both the prompt and the completions. It includes the following fields:

generatedToken:

The generatedToken field consists of three nested fields:

  • token: The string representation of the token.
  • logprob: The predicted log probability of the token after the sampling parameters are applied, as a float value.
  • raw_logprob: The raw predicted log probability of the token, as a float value. With neutral sampling parameters (namely, temperature=1, topP=1), raw_logprob equals logprob.
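To make these values easier to read, a log probability can be converted back to a probability, and the log probabilities of consecutive tokens can be summed to score a whole sequence. The sketch below uses the two generatedToken entries from the completion in the sample response. It assumes the values are natural logarithms, which this guide does not state explicitly; the helper names are hypothetical.

```python
import math

# generatedToken entries copied from the completion in the sample response.
tokens = [
    {
        "token": "▁things",
        "logprob": -0.0003219324571546167,
        "raw_logprob": -0.47372230887413025,
    },
    {
        "token": "▁–",
        "logprob": -7.797079563140869,
        "raw_logprob": -4.319167613983154,
    },
]


def to_probability(logprob):
    """Convert a log probability to a probability (assumes natural log)."""
    return math.exp(logprob)


def sequence_logprob(tokens, key="logprob"):
    """Sum per-token log probabilities to get the log probability of the sequence."""
    return sum(token[key] for token in tokens)


for token in tokens:
    print(
        token["token"],
        f"p={to_probability(token['logprob']):.4f}",
        f"raw p={to_probability(token['raw_logprob']):.4f}",
    )
print("sequence logprob:", sequence_logprob(tokens))
```

In the sample, logprob and raw_logprob differ because the request used a temperature other than 1.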

topTokens

The topTokens field is a list of the top K alternative tokens for this position, sorted by probability, according to the topKReturn request parameter. If topKReturn is set to 0, this field will be null.

Each token in the list includes:

  • token: The string representation of the alternative token.
  • logprob: The predicted log probability of the alternative token as a float value.

textRange

The textRange field indicates the start and end offsets of the token in the decoded text string:

  • start: The starting index of the token in the decoded text string.
  • end: The ending index of the token in the decoded text string.
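The offsets can be used to map each token back to the substring it covers. The sketch below applies the textRange values from the prompt in the sample response to the prompt text. It assumes that end is exclusive, as in a Python slice, which is consistent with the sample (the ranges are contiguous and the last end equals the length of the text). The helper name token_spans is hypothetical.

```python
# Prompt text and textRange values copied from the sample response.
prompt_text = "These are a few of my favorite"
text_ranges = [
    {"start": 0, "end": 9},
    {"start": 9, "end": 15},
    {"start": 15, "end": 30},
]


def token_spans(text, ranges):
    """Return the substring of text covered by each textRange (end exclusive)."""
    return [text[r["start"]:r["end"]] for r in ranges]


spans = token_spans(prompt_text, text_ranges)
for text_range, span in zip(text_ranges, spans):
    print(text_range["start"], text_range["end"], repr(span))
```

Joining the spans in order reproduces the original text, which is a quick way to check that the ranges were interpreted correctly.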