Contextual Answers [RAG Engine]

Answer questions based on information in your document library using the RAG Engine

The Contextual Answers RAG Engine is a model that lets you ask questions about information stored in a document set and get an answer in natural language.

Every AI21 account includes the ability to upload and maintain a personal document library. You can store an unlimited number of documents, up to 1 GB total, and then use the RAG Engine to query them. The RAG Engine is designed to limit answers to information found in the files in your library. The model will not use pre-training knowledge, or any other information outside your library, to answer your questions. You can also limit your requests to subsets of documents within your library, based on labels or paths that you have applied to individual files.

Step 1: Upload your files

Upload your files (PDF, DOCX, HTML, and TXT) to the RAG Engine, where we offer free storage up to 1 GB (contact us if you need more). You can also integrate your organization’s data sources, such as Google Drive, Amazon S3, and others, to automatically sync documents with the RAG Engine. To enable data source integration, contact us.

In this example, we will upload three documents to the library. We show examples for both the Python SDK and a direct HTTP REST request:

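Python SDK:

import os
from ai21 import AI21Client

client = AI21Client(
    # This is the default and can be omitted
    api_key=os.environ.get("AI21_API_KEY"),
)

def upload_rag_file(path, labels=None, path_meta=None, url=None):
    # The keyword names were lost from this listing and are reconstructed; check them against the SDK reference.
    file_id = client.library.files.create(
        file_path=path, labels=labels, path=path_meta, public_url=url
    )
    print(file_id)

# Note that each file name (ignoring the path) must be unique.
# path and label parameters are used for filtering, as described below.
# The file names here are placeholders.
upload_rag_file("employee_handbook_uk.txt",
        path_meta='hr/UK', labels=['hr', 'policy'])
upload_rag_file("employee_handbook_us.txt",
        path_meta='hr/US', labels=['hr', 'policy'])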
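
HTTP REST request:

import os
import requests

# Assumed values: the listing uses ROOT_URL and AI21_API_KEY without defining them.
ROOT_URL = "https://api.ai21.com/studio/v1/"
AI21_API_KEY = os.environ["AI21_API_KEY"]

def upload_library_file(file_path, path=None, labels=None, publicUrl=None):
  url = ROOT_URL + "library/files"
  data = {
    "path": path,
    "labels": labels,
    "publicUrl": publicUrl
  }
  # Open the file in a context manager so the handle is closed after the upload.
  with open(file_path, "rb") as f:
    response = requests.post(
      url,
      headers={"Authorization": f"Bearer {AI21_API_KEY}"},
      data=data,
      files={"file": f},
    )
  print(f"FileID: {response.json()['id']}")

# Apply label "hr" and path "/hr/UK" to allow filtering, described below.
# The file name here is a placeholder.
upload_library_file("employee_handbook_uk.txt",
                  path="/hr/UK", labels=["hr"])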

You can upload, list, delete, search, and query your documents right in your browser using the Studio platform.

Labeling and filtering requests

You can apply zero or more arbitrary text labels to each file, and limit your list, get, search, and query requests to files matching the specified labels.

Similarly, each file accepts an optional, arbitrary path-like label that acts as a hierarchical labeling system. For example, you might assign the path financial/USA to some files, financial/UK to other files, and then query financial/USA to get US financial info, financial/UK to get UK financial info, or financial/ to get all financial information. Path matching is simple prefix matching, and doesn't enforce any path-like syntax.

This can help you organize your filing system, while focusing your questions on a subset of documents.
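
As a rough illustration of the prefix-matching rule described above, the sketch below reproduces it locally with plain Python string operations. It does not call the RAG Engine. The matches_path and select helpers and the sample file names are hypothetical; the only behavior assumed is what is stated above (simple prefix matching, with no path syntax enforced).

```python
# Local illustration only: mimics the documented "simple prefix matching"
# rule with str.startswith. The helpers and sample data are hypothetical.

def matches_path(file_path: str, query_path: str) -> bool:
    """Return True if the file's path label starts with the queried path."""
    return file_path.startswith(query_path)

# Hypothetical file names mapped to their path labels.
files = {
    "report_us.pdf": "financial/USA",
    "report_uk.pdf": "financial/UK",
    "handbook_uk.txt": "hr/UK",
}

def select(query_path: str) -> list:
    """Sorted names of files whose path label matches the queried prefix."""
    return sorted(
        name for name, path in files.items() if matches_path(path, query_path)
    )

print(select("financial/USA"))  # only the US financial file
print(select("financial/"))     # both financial files
# No path syntax is enforced, so a partial segment is still a valid prefix:
print(select("financial/U"))    # both financial/USA and financial/UK
```

Because matching is by prefix only, a trailing slash matters: querying financial/ selects everything under that prefix, while a longer prefix narrows the selection.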

Step 2: Ask a question

Once you have documents in your library, you can ask questions based on document content and get an answer in natural language.

The RAG Engine searches through the document library, filtering documents by any filtering parameters that you provide, looking for relevant content. When it finds relevant content, it ingests it and then generates an answer, along with a list of sources from the library used to generate the answer.
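
The flow described above (filter, find relevant content, return an answer with its sources) can be pictured with a deliberately simplified local sketch. This is not how the RAG Engine is implemented: relevance here is naive word overlap, no language model is involved, and every name in it (Passage, tokenize, retrieve, the sample texts) is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    file_name: str
    labels: list
    text: str

def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(question, passages, label=None, top_k=2):
    """Toy retrieval: filter by label, then rank passages by word overlap."""
    # Step 1: apply the filter (case-sensitive label match).
    candidates = [p for p in passages if label is None or label in p.labels]
    # Step 2: score each candidate by the number of words shared with the question.
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(p.text)), p) for p in candidates]
    # Step 3: keep passages that share at least one word, best match first.
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in relevant[:top_k]]

# Hypothetical library content.
passages = [
    Passage("handbook_us.txt", ["hr"], "US employees can work from home up to two days per week."),
    Passage("handbook_uk.txt", ["hr"], "UK employees can work from home up to two days per week."),
    Passage("budget.txt", ["finance"], "The travel budget is reviewed every quarter."),
]

sources = retrieve("How many days can I work from home in the US?", passages, label="hr")
for passage in sources:
    print(passage.file_name, "->", passage.text)
```

In the real service the selected passages are passed to the model, which writes the natural language answer; the sketch stops at the list of sources.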

Let's ask a question about working from home:

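from ai21 import AI21Client

client = AI21Client()

def query_library(query, labels=None, path=None):
    # The arguments of this call were lost from the listing and are reconstructed; check them against the SDK reference.
    response = client.library.answer.create(
        question=query,
        path=path,
        labels=labels,
    )
    print(response.answer)

# Filter the question to documents with the case-sensitive label "hr"
query_library("How many days can I work from home in the US?", labels=["hr"])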

The response is:

"In the US, employees can work from home up to two days per week."

An excerpt of the full response (other fields omitted):

  "answer":"In the US, employees can work from home up to two days per week.",
  ...
        "US employees can work from home up to two days per week."
        "UK employees can work from home up to two days per week."

Note that the full response returned from the model contains the sources used to generate the answer (see the sources field).

If there is no answer

If the answer to the question is not in any of the documents, the model indicates this by setting answer_in_context to False and returning None for the answer field. For instance, suppose we ask the following question:

response = client.library.answer.create(question="What's my meal allowance when working from home?")

The answer will be None and answer_in_context will be False. You can test for this as shown below:

if not response.answer_in_context:
    print("The answer is not in the documents")
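
One way to fold this check into application code is a small wrapper that returns a fallback message when no answer is found. The sketch below relies only on the two fields described above (answer_in_context and answer). The answer_or_fallback helper is hypothetical, and the SimpleNamespace objects are stand-ins for real responses so that the example runs without an API call.

```python
from types import SimpleNamespace

def answer_or_fallback(response, fallback="The answer is not in the documents"):
    """Return the answer text, or a fallback when the library has no answer."""
    if not response.answer_in_context or response.answer is None:
        return fallback
    return response.answer

# Stand-ins for responses (hypothetical objects, not produced by the SDK).
found = SimpleNamespace(
    answer_in_context=True,
    answer="In the US, employees can work from home up to two days per week.",
)
missing = SimpleNamespace(answer_in_context=False, answer=None)

print(answer_or_fallback(found))
print(answer_or_fallback(missing))
```

Checking both fields keeps the caller from ever handling a None answer, even if only one of the two signals is set.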

Step 3: Explore more options

If you have a large collection of documents and files, it can be helpful to refine your retrieval process. By adding labels and assigning paths to each document, you can narrow the search and get more accurate results in less time. We provide several options for that purpose:

  1. Search within a specific path in your library: Focus your search on a particular location within your knowledge base.
  2. Search only for documents with specific labels: Filter your search to include only documents that have been assigned certain labels.
  3. Search within a designated group of documents: Specify the document IDs of a particular set of files, allowing the model to perform the search exclusively within that group.

The last option is particularly valuable when you already have a retrieval mechanism in place and know which documents might contain the answer, especially when those documents are long. Since our contextual-answers mechanism operates on document chunks rather than entire documents, you get precise answers quickly, without having to process each long document in full.
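
To make the three options concrete, here is a minimal local sketch that applies them to an in-memory list of file records. It is not the RAG Engine API: the LibraryFile record, the select_files helper, and the file IDs are hypothetical. Treating a label filter as "the file has at least one of the requested labels" is also an assumption that may differ from the service's behavior.

```python
from dataclasses import dataclass, field

@dataclass
class LibraryFile:
    file_id: str
    name: str
    path: str = ""
    labels: list = field(default_factory=list)

def select_files(files, path=None, labels=None, file_ids=None):
    """Return the files that pass every filter that was provided."""
    selected = []
    for f in files:
        # Option 1: path filter, using simple prefix matching.
        if path is not None and not f.path.startswith(path):
            continue
        # Option 2: label filter (assumed: at least one requested label present).
        if labels is not None and not set(labels) & set(f.labels):
            continue
        # Option 3: restrict the search to an explicit set of file IDs.
        if file_ids is not None and f.file_id not in file_ids:
            continue
        selected.append(f)
    return selected

# Hypothetical library content.
library = [
    LibraryFile("id-1", "employee_handbook_uk.txt", "hr/UK", ["hr", "policy"]),
    LibraryFile("id-2", "employee_handbook_us.txt", "hr/US", ["hr", "policy"]),
    LibraryFile("id-3", "annual_report.pdf", "financial/USA", ["finance"]),
]

print([f.name for f in select_files(library, path="hr/")])
print([f.name for f in select_files(library, labels=["finance"])])
print([f.name for f in select_files(library, file_ids={"id-2"})])
```

The filters combine, so passing both a path and a set of file IDs narrows the candidate set further before any question is asked.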