Contextual Answers (RAG Engine)
Answer questions based solely on documents in your document library
This model is no longer supported. For better results, see Jamba 1.5.
The Contextual Answers RAG Engine enables you to ask natural language questions based on information stored in a document set and get a natural-language-formatted answer.
Every AI21 account includes the ability to upload and maintain a personal document library. You can store as many documents as you like, up to 1 GB of total storage. You can then use the RAG Engine to query your documents in two different ways:
- Search for documents that include specific keywords or cover specific topics to see relevant snippets
- Ask a question and have the latest foundation model generate an answer based solely on the information in your documents. If the information is not in those documents, the model will tell you so.
The RAG Engine is carefully designed to limit answers to information found in the files in your library. The model will not use its pre-training data, or any other information outside the corpus of your library, to answer your questions. You can further limit your requests to subsets of documents within your library, based on tags or other metadata that you have applied to your individual files.
Links
- Playground: Try it in the browser, with all the API parameters.
- API documentation: Learn how to use our RAG Engine APIs:
  - RAG Engine overview
Step 1: Upload your files
Upload your files (PDF, DOCX, HTML, and TXT) to the RAG Engine; each account gets free storage of up to 1 GB. (Want more? Contact us: [email protected])
You can also integrate your organization's data sources, such as Google Drive, Amazon S3, and others, to automatically sync documents with the RAG Engine. To enable data source integration, contact us.
Tip
To upload, list, delete, and update your library, you can use these API endpoints or the AI21 playground web app.
In this example, we will upload three documents to the library. We show examples for both the Python SDK and a direct HTTP REST request:
import os
from ai21 import AI21Client

client = AI21Client(
    # This is the default and can be omitted
    api_key=os.environ.get("AI21_API_KEY"),
)

def upload_rag_file(path, labels=None, path_meta=None, url=None):
    file_id = client.library.files.create(
        file_path=path,
        labels=labels,
        path=path_meta,
        public_url=url
    )
    print(file_id)

# Note that each file name (ignoring the path) must be unique.
# The path and labels parameters are used for filtering, as described below.
upload_rag_file("/Users/dafna/uk_employee_policy.pdf",
                path_meta='hr/UK', labels=['hr', 'policy'])
upload_rag_file("/Users/dafna/us_employee_policy.md",
                path_meta='hr/US', labels=['hr', 'policy'])
upload_rag_file("/Users/dafna/it_security.html",
                labels=['security'])
import os
import requests

ROOT_URL = "https://api.ai21.com/studio/v1/"
AI21_API_KEY = os.environ.get("AI21_API_KEY")

def upload_library_file(file_path, path=None, labels=None, publicUrl=None):
    url = ROOT_URL + "library/files"
    files = {"file": open(file_path, "rb")}
    data = {
        "path": path,
        "labels": labels,
        "publicUrl": publicUrl
    }
    response = requests.post(
        url=url,
        headers={"Authorization": f"Bearer {AI21_API_KEY}"},
        data=data,
        files=files
    )
    print(f"FileID: {response.json()['id']}")

# Apply label "hr" and path "/hr/UK" to allow filtering, described below.
upload_library_file("/Users/dafna/Documents/employee_handbook_uk.txt",
                    path="/hr/UK", labels=["hr"])
Labeling and filtering requests
File labels
You can apply zero or more arbitrary text labels to each file, and later limit your list, get, search, and query requests to files matching specific labels.
File paths
Similarly, you can assign an optional, arbitrary path-like label to each file. This path-style label enables a hierarchical labeling system. For example, you might assign the path financial/USA to some files and financial/UK to other files, and then limit your query to financial/USA to get US financial info, financial/UK to get UK financial info, or financial/ to get all financial information. Path matching is simple prefix matching, and doesn't enforce or verify the path syntax.
This can help you organize your filing system while focusing your questions on a subset of documents.
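For instance, assuming files have been uploaded with the paths above, a query could be restricted to US financial documents only. This is a minimal sketch; the answer endpoint used here is introduced in Step 2 below, and the question itself is just a hypothetical example:

from ai21 import AI21Client

client = AI21Client()

# Restrict the question to files whose path starts with "financial/USA".
# Because matching is prefix-based, path="financial/" would match both regions.
response = client.library.answer.create(
    question="What was the US revenue last quarter?",
    path="financial/USA",
)
print(response.answer)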
Step 2: Ask a question
Once you have documents in your library, you can ask questions based on document content and get an answer in natural language.
The RAG Engine searches through the document library, filtering documents by any filtering parameters that you provide, looking for relevant content. When it finds relevant content, it ingests it and then generates an answer, along with a list of the library sources used to generate that answer.
Tip
To query files in code, use this endpoint. To query your library in your browser, use the RAG Engine playground and select Ask your documents.
Let's ask a question about working from home:
client = AI21Client()

def query_library(query, labels=None, path=None):
    response = client.library.answer.create(
        question=query,
        path=path,
        labels=labels
    )
    print(response.answer)

# Filter the question to documents with the case-sensitive label "hr".
query_library("How many days can I work from home in the US?", labels=["hr"])
The response is:
"In the US, employees can work from home up to two days per week."
{
  "id": "709f5db4-5834-ed03-c161-ebef2f2a2543",
  "answerInContext": true,
  "answer": "In the US, employees can work from home up to two days per week.",
  "sources": [
    {
      "fileId": "5143f17d-626d-4a1f-8e66-a4e7e129d238",
      "name": "hr/US/employee_handbook_us.txt",
      "highlights": [
        "US employees can work from home up to two days per week."
      ],
      "publicUrl": null
    },
    {
      "fileId": "d195f8a9-a8fe-4f4e-96fb-342246d55f53",
      "name": "/hr/UK/employee_handbook_uk.txt",
      "highlights": [
        "UK employees can work from home up to two days per week."
      ],
      "publicUrl": null
    }
  ]
}
Note that the full response returned from the model contains the sources used to generate the answer (see the sources field).
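If you want to show citations alongside the answer, you can iterate over that field. A minimal sketch, assuming the Python SDK exposes the sources as objects whose name and highlights attributes mirror the JSON fields above:

response = client.library.answer.create(
    question="How many days can I work from home in the US?",
    labels=["hr"],
)

# Print each source document and the highlighted passages that supported the answer.
for source in response.sources:
    print(source.name)
    for highlight in source.highlights:
        print(f"  - {highlight}")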
If there is no answer
If the answer to the question is not in any of the documents, the response will have answer_in_context set to False and an empty answer field. For instance, if we ask the following question:
response = client.library.answer.create(question="What's my meal allowance when working from home?")
The response will be empty and answer_in_context will be False. You can test for this as shown below:
if response.answer_in_context:
    print(response.answer)
else:
    print("The answer is not in the documents")
Improving your results
Working with PDFs
When analyzing PDFs, the recommended approach is to use AI21's native PDF support. If you are using a custom parser, ensure that it accurately parses tables and other structured information, as Contextual Answers can be sensitive to incorrectly parsed input data. Note that Contextual Answers also supports .docx, .html, and standard .txt files.
When analyzing tables, we recommend passing the table contents as JSONL, where each row is represented as key-value pairs mapping each column name to the corresponding row entry. For smaller tables, or for tables embedded within a larger text, this step can frequently be skipped, as Contextual Answers will generally be able to surface answers from the raw table.
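A minimal sketch of that conversion, assuming a table that has already been parsed into a header row and data rows (the column names and values below are hypothetical):

import json

# Hypothetical parsed table: column names plus data rows.
columns = ["Employee", "Region", "Remote days per week"]
rows = [
    ["Alice", "US", 2],
    ["Bob", "UK", 2],
]

# Write one JSON object per row, keyed by column name, producing a JSONL file
# that can then be uploaded to the library like any other text file.
with open("remote_policy_table.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(dict(zip(columns, row))) + "\n")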
Benchmarking your answers
When evaluating Contextual Answers (as when evaluating any model), a careful evaluation process is crucial for assessing performance. Follow these general steps to refine your evaluation methods; a minimal evaluation-loop sketch follows the list:
Create a Test Set: Begin with 10 or more questions, each with their own contexts and "golden answers." This helps in establishing a baseline for measuring improvements. We recommend having a diverse set of question types.
Ensure Accuracy of your Test Set: Verify that the golden answers are correct and indeed contained within the given context. While you can use answers sourced from other large language models, it is essential to verify that those answers are accurate and actually contained within the context.
Comprehensive Evaluation: Evaluate not only the True Positive instances (correct answers) but also consider True Negative instances (correctly identified as "Answer not in doc"). This ensures a balanced evaluation of the model.
Evaluate Correctness of the Response: Evaluation can be done either manually or automatically (for example, by using an LLM). However, if an LLM is used, care should be taken to avoid biases in evaluation, since LLMs generally prefer responses generated by an LLM of a similar type. It is recommended that human evaluation be used either entirely or at least to validate the LLM's classification of the correctness of the Contextual Answers response.
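A minimal sketch of such an evaluation loop, assuming a small hand-built test set and a simple containment check for correctness (in practice, a human reviewer or an unbiased judge should replace the check; the questions and golden answers below are hypothetical):

from ai21 import AI21Client

client = AI21Client()

# Hypothetical test set: each case has a question and a golden answer,
# or golden=None when the answer should not be found in the documents.
test_set = [
    {"question": "How many days can I work from home in the US?",
     "golden": "two days per week"},
    {"question": "What's my meal allowance when working from home?",
     "golden": None},
]

correct = 0
for case in test_set:
    response = client.library.answer.create(question=case["question"])
    if case["golden"] is None:
        # True negative: the engine should report that the answer is not in the docs.
        correct += int(not response.answer_in_context)
    else:
        # Crude correctness check: the golden answer appears in the response.
        correct += int(response.answer_in_context
                       and case["golden"].lower() in response.answer.lower())

print(f"Accuracy: {correct}/{len(test_set)}")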