Manage the RAG document library

Overview

These endpoints are used to manage files in your RAG Engine file library. The library can hold an unlimited number of files, but the total size of all files cannot exceed 1 GB.

You can run the RAG Engine against these documents to get contextual answers that draw on multiple documents, or use semantic search within the library to implement your own search engine.

File metadata

Files support the following metadata:

  • fileId  uuid   The unique identifier of the file, generated by AI21.
  • name string   The name of the file specified by you.
  • path string   An arbitrary file-path-like string that you can assign to indicate the content of a file, for example pets/fish or pets/dogs. It has nothing to do with a location in storage, the source location, or the local file path used when the file was uploaded. When searching your library, you can filter files by full path or path prefix. So to search only files in the "dog folder", filter by the path pets/dogs. To search all files in the pet "folder", filter by the path pets/. A path is not required to start or end with a / mark, but be consistent in your usage, because all matches are prefix matches, not substring matches. So dog/ matches dog/ and dog/setter but not pets/dog/.
  • fileType string   The type of the file.
  • sizeBytes int   The size of the file in bytes.
  • labels list of strings   The labels associated with the file. You can apply arbitrary string labels to your files and limit queries to files with one or more labels. Labels work similarly to paths, except that a label must match exactly; there is no prefix matching. Labels are case-sensitive. You can specify a maximum of 20 labels per account.
  • publicUrl string   The public URL of the file, specified by you. This URL is not validated by AI21 or used in any way. It is strictly a piece of metadata that you can optionally attach to a file.
  • createdBy  uuid   The identifier of the user who uploaded the file.
  • creationDate  date   The date when the file was uploaded.
  • lastUpdated  date   The last update date of the file in the library.
  • status string   The status of the file.
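
The path and label matching rules described above can be illustrated with a small local sketch. It does not call the API: path_matches and has_any_label are hypothetical helper names, and the code only mimics the documented behavior (paths match by prefix, labels match exactly and case-sensitively).

```python
def path_matches(file_path: str, path_filter: str) -> bool:
    """Mimic the documented rule: a path filter matches by prefix, not substring."""
    return file_path.startswith(path_filter)


def has_any_label(file_labels: list[str], wanted: list[str]) -> bool:
    """Mimic the documented rule: labels match exactly and are case-sensitive."""
    return any(label in file_labels for label in wanted)


print(path_matches("dog/setter", "dog/"))    # True
print(path_matches("pets/dog/", "dog/"))     # False
print(has_any_label(["hr", "UK"], ["HR"]))   # False
```

Because matching is by prefix, a leading slash matters: a file stored with the path /pets/dogs would not be found by the filter pets/.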

Filtering file results

When you create or update a file, you can optionally apply path and/or label values to a file. You can later search, query, or list based on matching paths or labels. This can be useful to query subsets of documents. For example, you can assign a path of either /financial/USA or /financial/UK to certain documents and later query only US financial documents, only UK financial documents, or query all financial documents by specifying /financial/ as the path.

Library management methods

Upload a file to the library

Upload files to use for RAG Engine document searches. You can assign metadata to your files to limit searches to specific files by file metadata. There is no bulk upload method; files must be loaded one at a time.

  • Max number of files: No limit. The playground limits bulk uploads to 50 files per request.
  • Max total library size: 1 GB
  • Max file size: 100 MB
  • Supported file types: PDF, DocX, HTML, TXT

Request

POST https://api.ai21.com/studio/v1/library/files

Request body fields

  • file   string   →Required
    Raw file bytes. Every uploaded file must have a unique file name, not including the local file path. Different path parameter values do not enable uploading files with the same name. Specifying file + path or file + labels will find only files with both the specified name and path/label.
    SDK NOTE: In the SDK this parameter is replaced with file_path, which is the local string path and name of the file, not the file bytes.
  • path   string   →Optional
    Arbitrary file-path-like string that indicates the content of the file. Can be used to limit the scope of files to use in a query or search request. See above to learn more about paths. Paths are case-sensitive. Specifying path + file or path + labels will find only files with both the specified path and name/label.
  • labels   array of strings   →Optional
    Arbitrary string labels that describe the contents of this file. Labels are case-sensitive. There is a maximum of 20 labels per account. Specifying labels + path or labels + name will find only files with both any of the specified labels AND the specified name/path.
  • publicUrl   string   →Optional
    A public URL associated with the file, if any. Only used as metadata, to indicate the location of the source file. For example, if implementing a search engine against a website, specifying a URL for each uploaded file is a simple way to present the link to the file in the search results presented to the user. (Tip: You can also provide a file path or any arbitrary string: URL validity is not checked.)
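
# Python SDK
from ai21 import AI21Client

client = AI21Client(api_key="<AI21_API_KEY>")  # replace with your API key

def upload_rag(path, labels, path_meta):
    # file_path is the local path of the file; path is the library path metadata
    response = client.library.files.create(
        file_path=path,
        labels=labels,
        path=path_meta
    )
    print(response)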

# Python (requests)
import requests

ROOT_URL = "https://api.ai21.com/studio/v1/"
AI21_API_KEY = "<AI21_API_KEY>"  # replace with your API key


def upload_library_file(file_path, path=None, labels=None, publicUrl=None):
  url = ROOT_URL + "library/files"
  data = {
    "path": path,
    "labels": labels,
    "publicUrl": publicUrl
  }
  # Open the file in binary mode and close it once the request completes
  with open(file_path, "rb") as f:
    response = requests.post(
      url=url,
      headers={"Authorization": f"Bearer {AI21_API_KEY}"},
      data=data,
      files={"file": f}
    )
  print(f"FileID: {response.json()['id']}")

upload_library_file("/Users/dafna/Documents/employee_handbook_uk.txt",
                  path="/hr/UK", labels=['hr'])

import fetch from 'node-fetch';
import fs from 'fs';
import FormData from 'form-data';

const url = "https://api.ai21.com/studio/v1/library/files";
const API_KEY = "your-api-key"; // replace with your API key

const file = fs.createReadStream("/Users/user/Desktop/folder/koala.txt");
const formData = new FormData();

formData.append('file', file);
formData.append('path', 'desktop/folder');
formData.append('labels', JSON.stringify(["cuties", "furry_animals"]));
formData.append('publicUrl', 'https://en.wikipedia.org/wiki/Koala');

fetch(url, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
  },
  body: formData
}).then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));

Responses

Response 200

A unique identifier for the uploaded file. Use it later to request, modify, or delete the file. You don't need to store the value, though, because it is returned along with all other file information in a GET /files request. Type: UUID. Example: da13301a-14e4-4487-aa2f-cc6048e73cdc

Error: Unsupported document type 422

Error message:

{
    "detail": "Invalid file type: image/png. Supported file types are: text/plain, text/html, application/docx, application/pdf"
}

Error: Same file name, same path 422

Error message:

{
  "detail": "File: {fileName} already exists",
  "suggestion": "To override the file content, delete it first using the DELETE endpoint"
}
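
The two error bodies above share a detail field, and the second also carries a suggestion. A minimal sketch of turning such a body into one readable message follows; describe_error is a hypothetical helper, and it assumes error bodies always have the shape shown in these examples.

```python
def describe_error(status_code: int, body: dict) -> str:
    """Build a readable message from an error body shaped like the examples above."""
    message = f"{status_code}: {body.get('detail', 'unknown error')}"
    suggestion = body.get("suggestion")  # only present in some error bodies
    if suggestion:
        message += f" ({suggestion})"
    return message


body = {
    "detail": "File: koala.txt already exists",
    "suggestion": "To override the file content, delete it first using the DELETE endpoint",
}
print(describe_error(422, body))
```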

List library files

Retrieve a list of documents in the user's library. Optionally specify a filter to find only files with matching labels or paths. This method returns only metadata about files; to download a file, call GET .../files/{file_id}/download

When you specify qualifiers with your request, only files that match all qualifiers are returned. So, for example, if you specify label='financial' and status='UPLOADED', only files with the label "financial" AND status UPLOADED will be returned.

Request

GET https://api.ai21.com/studio/v1/library/files

Request query parameters

The following optional parameters can be used to filter your results.

  • name   string   →Optional
    The full name of the uploaded file, without any path parameters. So: "mydoc.txt", not "/users/benfranklyn/Documents/mydoc.txt". Does not match name substrings, so "doc.txt" does not match "mydoc.txt".
  • path   string   →Optional
    Return only files with a path value that is a full match or starts with this string. So specifying "financial/" will match documents with "financial/taxes" but not documents with "money/financial/taxes".
  • status   string   →Optional
    Status of the file in the library. Supported values:
    • DB_RECORD_CREATED
    • UPLOADED
    • UPLOAD_FAILED
    • PROCESSED
    • PROCESSING_FAILED
  • label   array[string]   →Optional
    Return only files with this label. Label matching is case-sensitive, and will not match substrings.
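
As a sketch of how these filters might be combined into one request URL, the helper below builds the query string and rejects status values that are not in the list above. build_list_url is a hypothetical name, and the code assumes the filters are sent as URL query parameters. The label filter is left out because this page does not say how an array value is encoded.

```python
from urllib.parse import urlencode

LIST_URL = "https://api.ai21.com/studio/v1/library/files"
SUPPORTED_STATUSES = {
    "DB_RECORD_CREATED", "UPLOADED", "UPLOAD_FAILED", "PROCESSED", "PROCESSING_FAILED",
}


def build_list_url(name=None, path=None, status=None) -> str:
    """Build a GET URL that carries only the filters that were actually set."""
    if status is not None and status not in SUPPORTED_STATUSES:
        raise ValueError(f"unsupported status: {status}")
    filters = {"name": name, "path": path, "status": status}
    query = urlencode({key: value for key, value in filters.items() if value is not None})
    return f"{LIST_URL}?{query}" if query else LIST_URL


print(build_list_url(path="financial/", status="PROCESSED"))
# https://api.ai21.com/studio/v1/library/files?path=financial%2F&status=PROCESSED
```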

Pagination:
By default, the endpoint returns up to 1000 files. Pagination can be controlled using the following parameters:

  • offset integer   →Optional
    The number of files to skip.
  • limit integer   →Optional
    The number of files to retrieve (maximum 1,000, default 1,000).
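
For libraries with more than 1,000 files, the offset and limit parameters can be combined into a simple paging loop. The sketch below is illustrative only: iter_all_files and fetch_page are hypothetical names, the real API call is replaced by a local stand-in, and the loop assumes that a page shorter than limit means there are no more files.

```python
from typing import Callable, Iterator


def iter_all_files(fetch_page: Callable[[int, int], list], limit: int = 1000) -> Iterator[dict]:
    """Yield every file by requesting pages until a page comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:  # a short (or empty) page means there is nothing left
            return
        offset += limit


# Stand-in for a real API call: serves slices of a fake 2,500-file library.
fake_library = [{"name": f"doc{i}.txt"} for i in range(2500)]
files = list(iter_all_files(lambda offset, limit: fake_library[offset:offset + limit]))
print(len(files))  # 2500
```

In real use, fetch_page would wrap a GET request to the list endpoint and return the parsed array of file metadata items.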

# Python SDK
import os
from ai21 import AI21Client

os.environ["AI21_API_KEY"] = "<YOUR_API_KEY>"  # replace with your API key

client = AI21Client()

def get_files():
    # Just return the first 10 files
    response = client.library.files.list(
        offset=0,
        limit=10,
        status="PROCESSED"  # Return only files with this status
    )
    print(response)

## Response
[
  {
    file_id='44fwe3-ce35-4b6b-8e03-631151eae23b',
    name='schnauzers.md',
    file_type='text/markdown',
    size_bytes=6986,
    created_by='750592e4-faa5-4944-8781-7c51e596699b',
    creation_date='2024-06-10',
    last_updated='2024-06-10',
    status='PROCESSED',
    path='dogs/schnauzers',
    labels=['mobile','other'],
    public_url='https://example.com/petstore/schnauzers',
    error_code=None,
    error_message=None
  },
   ...
]

# Python (requests)
import requests

ROOT_URL = "https://api.ai21.com/studio/v1/"
AI21_API_KEY = "<AI21_API_KEY>"  # replace with your API key

def list_files():
  url = ROOT_URL + "library/files"
  params = {
    "offset": 0,  # Optional: Adjust as needed
    "limit": 100,  # Optional: Adjust as needed
    "status": "PROCESSED"
  }
  response = requests.get(
    url=url,
    headers={"Authorization": f"Bearer {AI21_API_KEY}"},
    params=params
  )
  print(response.text)

import fetch from 'node-fetch';

const url = "https://api.ai21.com/studio/v1/library/files";
const API_KEY = "your-api-key"; // replace with your API key

fetch(url, {
  method: 'GET',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
  }
}).then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));

Responses

Response 200

A successful response returns an array of file metadata items.

Get file metadata by file ID

Get metadata about a specific file by file ID. The file ID is generated by AI21 when you upload the file.

Request

GET https://api.ai21.com/studio/v1/library/files/{file_id}

# Python SDK
import os
from ai21 import AI21Client

os.environ["AI21_API_KEY"] = "<YOUR_API_KEY>"  # replace with your API key

client = AI21Client()

def get_file_by_id(file_id):
    response = client.library.files.get(
        file_id=file_id
    )
    print(response.to_json())

## Response
{
  "fileId":"855bf8f4-cc45-4ba3-9139-f3326000e86a",
  "name":"test.txt",
  "fileType":"text/plain",
  "sizeBytes":14,
  "createdBy":"750592e4-a224-4944-8781-7c51e596699b",
  "creationDate":"2024-05-06",
  "lastUpdated":"2024-05-06",
  "status":"PROCESSED",
  "path":null,
  "labels":[
    
  ],
  "publicUrl":null,
  "errorCode":null,
  "errorMessage":null
}
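
# Python (requests)
import requests

API_KEY = "your-api-key"  # replace with your API key
file_id = "your-file-id"  # replace with your file ID

url = f"https://api.ai21.com/studio/v1/library/files/{file_id}"
headers = {"Authorization": f"Bearer {API_KEY}"}

response = requests.get(url, headers=headers)
print(response.json())
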
import fetch from 'node-fetch';

const DOCUMENT_ID = "your-document-id"; // replace with your document id
const API_KEY = "your-api-key"; // replace with your API key
const url = `https://api.ai21.com/studio/v1/library/files/${DOCUMENT_ID}`;

fetch(url, {
  method: 'GET',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
  }
}).then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));

Responses

Response 200

A successful response returns the metadata of the requested file. In addition to the fields listed under File metadata, the response includes:

  • errorCode int   The error code, if any.
  • errorMessage string   The error message, if any.

Response 422

No matching ID is found.

Get file download link

Get a link used to download the file contents from the library.

Request

GET https://api.ai21.com/studio/v1/library/files/{file_id}/download

import requests

ROOT_URL = "https://api.ai21.com/studio/v1/"
AI21_API_KEY = "<AI21_API_KEY>"  # replace with your API key

def get_file_download_link(file_id):
  url = f"{ROOT_URL}library/files/{file_id}/download"
  response = requests.get(
    url=url,
    headers={"Authorization": f"Bearer {AI21_API_KEY}"}
  )
  print("Download link: ", response.text)

Responses

Response 200

A successful response returns a link that you can use to download the file contents.
errorCode int   The error code, if any.
errorMessage string   The error message, if any.

Response 422

No matching ID is found.

Update file metadata

Update the specified parameters of a specific document in the user's library. This operation currently supports updating the publicUrl and labels parameters.

This operation overwrites the specified items with the new data you provide. If you wish to add new labels to the labels list without removing the existing ones, you must submit a labels list that includes both the current and new labels.

For instance, if the current labels are "Label A" and "Label B", and you wish to add "New Label C" and "New Label D" to the list, you must specify "labels": ["Label A", "Label B", "New Label C", "New Label D"].
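
Because an update replaces the whole list, one way to add labels is to merge the new labels into the current ones before sending the request. A minimal local sketch follows; merge_labels is a hypothetical helper, not an SDK function, and the current labels are assumed to come from an earlier metadata request.

```python
def merge_labels(current: list[str], new: list[str]) -> list[str]:
    """Return current labels followed by any new labels that are not already present."""
    merged = list(current)
    for label in new:
        if label not in merged:  # labels are case-sensitive, so compare exactly
            merged.append(label)
    return merged


labels = merge_labels(["Label A", "Label B"], ["New Label C", "Label B", "New Label D"])
print(labels)  # ['Label A', 'Label B', 'New Label C', 'New Label D']
```

The merged list is what you would then pass as the labels value in the update request.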

Request

PUT https://api.ai21.com/studio/v1/library/files/{file_id}

Request body

  • publicUrl   string   →Optional
    The updated public URL of the document.
  • labels   array of strings   →Optional
    The updated labels associated with the file. Separate multiple labels with commas.

# Python SDK
import os
os.environ["AI21_API_KEY"] = "<YOUR_API_KEY>"  # replace with your API key

from ai21 import AI21Client
from ai21 import errors as ai21_errors
from ai21 import AI21APIError

client = AI21Client()
def update_file_metadata(file_id):
    try:
        response = client.library.files.update(
            file_id=file_id,
            labels=["reptiles","boring"], # Replaces, not adds to, existing labels
            public_url="https://example.com/petstore/iguanas"
        )
    except ai21_errors.AI21ServerError as e:
        print("Error: server could not be reached")
        print(e.details)
    except ai21_errors.TooManyRequestsError as e:
        print("A 429 status code was returned. Slow down on the requests")
    except AI21APIError as e:
        print(f"Another error: {e} {e.status_code} For more error types see ai21.errors")
    else:
        print("File updated")
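
# Python (requests)
import requests

API_KEY = "your-api-key"  # replace with your API key
file_id = "your-file-id"  # replace with your file ID

url = f"https://api.ai21.com/studio/v1/library/files/{file_id}"
headers = {"Authorization": f"Bearer {API_KEY}"}
data = {
    # Replaces, not adds to, existing labels
    "labels": ["Label A", "Label B", "New Label C", "New Label D"],
    "publicUrl": "www.updated-url.com"
}

response = requests.put(url, headers=headers, json=data)
print(response.status_code)  # 200 on success; the response has no body
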
import fetch from 'node-fetch';

const DOCUMENT_ID = "your-document-id"; // replace with your document id
const API_KEY = "your-api-key"; // replace with your API key
const url = `https://api.ai21.com/studio/v1/library/files/${DOCUMENT_ID}`;

let data = {
  "labels": ["Label A", "Label B", "New Label C", "New Label D"],
  "publicUrl": "www.updated-url.com"
};

fetch(url, {
  method: 'PUT',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(data)
}).then(res => console.log(res.status)) // 200 on success; the response has no body to parse
  .catch(err => console.error('error:' + err));

Responses

If the update is successful, the server will return an HTTP status code 200 with no body content. If the document ID does not exist, a 422 error message will be returned.

Delete file

Delete the specified file from the library.

Restrictions:
Files in PROCESSING status cannot be deleted. Attempts to delete such files will result in a 422 error.

Request

DELETE https://api.ai21.com/studio/v1/library/files/{file_id}

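# Python (requests)
import requests

API_KEY = "your-api-key"  # replace with your API key
file_id = "your-file-id"  # replace with your file ID

url = f"https://api.ai21.com/studio/v1/library/files/{file_id}"
headers = {"Authorization": f"Bearer {API_KEY}"}

response = requests.delete(url, headers=headers)
print(response.status_code)  # 200 on success; the response has no body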

import fetch from 'node-fetch';

// Replace DOCUMENT_ID with your file ID and API_KEY with your API key
const url = "https://api.ai21.com/studio/v1/library/files/DOCUMENT_ID";
const headers = {"Authorization": "Bearer API_KEY"};

fetch(url, {
  method: 'DELETE',
  headers: headers
})
.then(response => console.log(response.status)) // 200 on success; the response has no body to parse
.catch(error => console.error('Error:', error));

Responses

Response 200

Successful deletion. No body content.

Response 422

No file with this ID exists.