Custom Models

Using AI21 Studio, you can train and query your own custom versions of our base models. Custom models are fine-tuned for optimal performance on a training set of examples representing a specific task.

Custom models can be trained to perform virtually any language task. Use cases include generating marketing copy, powering chatbots and assisting creative writing.

You can train custom models based on either J2-Mid or J2-Light (soon also J2-Ultra) and pick your desired cost/quality trade-off.

Why Train a Custom Model?

🧵🪡✂ Best results tailored to your specific use case

Given a sufficient number of training examples, custom J2 models exceed the quality attainable with general purpose models and prompt engineering. For many use cases, you can expect custom models to begin outperforming prompt engineering with as few as 50-100 examples. To learn more, read our case study blog post, where we address a specific language task, using both general purpose models and custom models.

Furthermore, custom models derive their quality from the training data you provide; adding more, higher quality examples will improve results. This means you can continuously refine your custom model by curating high-quality data for your task.

🚀🚀 Faster performance

The training process bakes the task-specific behavior into the custom model. This means your prompts no longer need to include elaborate instructions and examples designed to guide a general purpose model to perform the desired task. Instead, your prompts only need to include the specific input you'd like to handle, reducing the amount of text that gets processed and decreasing latency.

💪🦾 Adversarial robustness

One potential safety risk of large language models is deliberate misuse by malicious users of your application, exploiting its access to Jurassic-2 to generate text for their malicious purposes. Adversaries may attempt to achieve this via “prompt injection”, where the end-user’s input text is crafted to alter the normal behavior of the model. Custom models are less susceptible to such attacks than general purpose models, offering a significant safety advantage when deployed in production. For more information, read our case study blog post.

Sounds great! How do I do it?

There are two parts to the training process:

  1. Build a dataset.

  2. Train a custom model based on this dataset.