Links
- Playground: Try it in the browser, with all the API parameters.
- API documentation: Learn how to use our RAG Engine APIs:
Library specifics
To see details such as supported file formats and max file sizes, see the file upload reference.
Features
- Seamless integration between retrieval and generation: RAG Engine automatically integrates with many of our task-specific models, so you can surface search results or provide a grounded answer to a query based on your organizational data – all within a single API call. You can also connect RAG Engine with a foundation model like Jurassic – and use Semantic Search results within a prompt.
- Support for several file formats: The RAG Engine supports several different file types including PDF. The parser understands complex layouts, including tables.
- Built-in data source integration: You can integrate your organization’s data sources, such as Google Drive, Amazon S3, and others, to automatically sync documents with RAG Engine. To enable data source integration, contact us.
- Secure and appropriate access to documents: The RAG Engine adheres to organizational user and group permissions with a comprehensive approach that addresses the intricate needs of document management in modern, data-intensive environments.
Current limitations
- OCR functionality is not available at this time. Only PDF documents that contain a text layer are supported.
- Some PDFs may take longer to process, depending on their content.
How the RAG engine works
The RAG Engine comprises the following parts, each of which can be called individually:
- A document library manager: Upload files into the library and parse their content into text. The library can read several different formats, including PDFs, and can understand complex structures such as tables.
- Indexing: When a document is uploaded, the engine parses it into segments by topic. Adjacent sections of the doc that are about the same topic are considered to be the same segment (up until a maximum size). Each segment has an embedding value that expresses the meaning of the segment in numerical form. The engine stores a list of segments and their embedding values.
- Pro tip: A document is actually segmented and rated in two different ways: using sparse evaluation, which is a simple list of keywords in the segment, and dense evaluation, which is a vector of 1024 values that gives a more complex evaluation of what the segment is about (monkeys, pets, care and feeding, etc.) as calculated by the embedding engine.
- Retrieval: The engine extracts the user’s question, calculates an embedding value for it that represents the meaning of the question, then looks through the index for segments that seem to be about the same topic. Matching segments are put into a candidate pool for the next step.
- Pro tip: There is a similarity threshold between the user’s question and the segment. The retrieval engine looks for segments within this threshold.
- Question answering: All segments in the candidate pool are combined into a prompt, along with the question, and sent to an LLM to generate an answer. (The prompt is basically, “here is some information and a question; generate an answer to the question using only the information given here; if you don’t know the answer, say ‘I don’t know’”.)
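The retrieval and question-answering steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the RAG Engine's implementation: the word-count `embed` function is a toy stand-in for the real 1024-value embeddings, and the function names, the threshold of 0.2, and the prompt wording are all assumptions made for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, segments, threshold=0.2):
    """Return segments whose similarity to the question meets the threshold,
    most similar first. The threshold value is an arbitrary assumption."""
    q = embed(question)
    scored = [(cosine(q, embed(segment)), segment) for segment in segments]
    return [seg for score, seg in sorted(scored, reverse=True) if score >= threshold]


def build_prompt(question, candidates):
    """Combine the candidate pool and the question into a single prompt."""
    context = "\n".join(f"- {segment}" for segment in candidates)
    return (
        "Here is some information and a question. Answer the question using "
        "only the information given here. If you don't know the answer, say "
        "\"I don't know\".\n\n"
        f"Information:\n{context}\n\nQuestion: {question}"
    )


segments = [
    "Capuchin monkeys need a varied diet of fruit and insects.",
    "Quarterly revenue grew in the UK market.",
    "Pet monkeys need daily care and feeding.",
]
question = "What diet do monkeys need?"
candidates = retrieve(question, segments)
prompt = build_prompt(question, candidates)
print(prompt)
```

Here the unrelated revenue segment scores zero and is left out of the candidate pool, so it never reaches the prompt.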
Using the RAG engine
Step 1: Upload your files
Upload your files (PDF, DOCX, HTML, and TXT) to the RAG Engine. Each account gets up to 1 GB of free storage. (Want more? Contact us: sales@ai21.com) You can also integrate your organization’s data sources, such as Google Drive, Amazon S3, and others, to automatically sync documents with RAG Engine. To enable data source integration, contact us.
Tip: To upload, list, delete, and update files in your library, you can use either these API endpoints or the AI21 playground web app.
Labeling and filtering requests
- File labels: You can apply zero or more arbitrary text labels to each file, and later limit your list, get, search, and query requests to files matching specific labels.
- File paths: Similarly, you can assign an optional, arbitrary path-like label to each file. This path-style label enables a hierarchical labeling system. For example, you might assign the path financial/USA to some files and financial/UK to other files, and then limit your query to financial/USA to get US financial info, financial/UK to get UK financial info, or financial/ to get all financial information. Path matching is simple prefix matching, and doesn’t enforce or verify the path syntax.
This can help you organize your filing system, while focusing your questions on a subset of documents.
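To make the filtering behavior concrete, here is a small local sketch of label and path-prefix matching. It does not call the RAG Engine: `LibraryFile` and `filter_files` are hypothetical names, and treating a label filter as "matches at least one of the given labels" is an assumption, since the text above does not say whether all labels must match.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryFile:
    """Hypothetical local record of a file in the library."""
    name: str
    path: str = ""                            # optional path-like label
    labels: set = field(default_factory=set)  # zero or more text labels


def filter_files(files, path_prefix="", labels=None):
    """Return files whose path starts with path_prefix and, if labels are
    given, that carry at least one of them (any-match is an assumption)."""
    wanted = set(labels or ())
    return [
        f for f in files
        if f.path.startswith(path_prefix) and (not wanted or f.labels & wanted)
    ]


files = [
    LibraryFile("q1_us.pdf", "financial/USA", {"2023"}),
    LibraryFile("q1_uk.pdf", "financial/UK", {"2023", "draft"}),
    LibraryFile("handbook.docx", "hr/policies", {"draft"}),
]

usa_only = [f.name for f in filter_files(files, path_prefix="financial/USA")]
all_financial = [f.name for f in filter_files(files, path_prefix="financial/")]
drafts = [f.name for f in filter_files(files, labels={"draft"})]
print(usa_only, all_financial, drafts)
```

Because matching is a plain string-prefix test, a prefix such as fin would also match both financial paths; nothing checks that the prefix ends on a path separator.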
Step 2: Ask a question
Once you have documents in your library, you can ask questions based on document content and get an answer in natural language. The RAG Engine searches through the document library, filtering documents by any filtering parameters that you provide, looking for relevant content. When it finds relevant content, it ingests it and then generates an answer, along with a list of sources from the library used to generate the answer (sources field).
Tip: To query files in code, use this endpoint. To query your library in your browser, use the RAG Engine playground and select Ask your documents.
If there is no answer
If the answer to the question is not in any of the documents, the response will have answer_in_context set to False and an assistant message saying “Answer not in document.”
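A caller will usually want to branch on this flag before showing an answer. The sketch below assumes the response has already been parsed into a Python dict. Only the answer_in_context and sources field names come from the text above; the `answer` and `file_name` keys, and the `summarize_answer` helper itself, are hypothetical and should be checked against the actual API response.

```python
def summarize_answer(response):
    """Turn a parsed response dict into a one-line string for display."""
    # Treat a missing flag the same as False: no grounded answer was found.
    if not response.get("answer_in_context", False):
        return "No answer found in the library."
    # "file_name" is a hypothetical key for each entry in the sources list.
    names = ", ".join(s.get("file_name", "unknown") for s in response.get("sources", []))
    return f"{response['answer']} (sources: {names})"


found = {
    "answer_in_context": True,
    "answer": "Revenue grew 4% in Q1.",  # "answer" is a hypothetical key
    "sources": [{"file_name": "q1_us.pdf"}],
}
not_found = {
    "answer_in_context": False,
    "answer": "Answer not in document.",
    "sources": [],
}

print(summarize_answer(found))
print(summarize_answer(not_found))
```

Checking the flag rather than comparing the message text keeps the caller independent of the exact wording of the "Answer not in document." message.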