Large Language Models

Introduction to the core of our product

Natural Language Processing (NLP) has seen rapid growth in recent years, driven by the introduction of Large Language Models (LLMs). These huge models are based on the Transformer architecture, which makes it possible to train far larger and more capable language models than before.

We divide LLMs into two main categories: autoregressive and masked language models (LMs). On this page we focus on autoregressive LLMs, as our language models, the Jurassic-2 series, belong to this category.


The task: predict the next word

An autoregressive LLM is a neural network with billions of parameters. It is trained on a vast collection of texts with one primary objective: predicting the next word given the input text. This prediction step is repeated many times, with each predicted word appended to the text so far, until a full, coherent piece of text emerges: sentences, paragraphs, articles, even books. In terms of terminology, the 'prompt' is the input text given to the model, and the 'completion' is the output text the model generates.
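The repeated predict-and-append loop described above can be sketched in a few lines. The `next_word` function below is a hypothetical stand-in for the real model: a tiny lookup table instead of a neural network with billions of parameters, used only to make the loop runnable.

```python
def next_word(words):
    """Hypothetical next-word predictor: a tiny lookup table keyed on the
    last word, standing in for a billions-of-parameters neural network
    that would assign probabilities over a full vocabulary."""
    table = {
        "the": "cat",
        "cat": "sat",
        "sat": "on",
        "on": "the",
    }
    return table.get(words[-1], "<end>")

def generate(prompt, max_words=10):
    """Repeatedly predict the next word and append it to the text.
    The input is the 'prompt'; the generated part is the 'completion'."""
    words = prompt.split()
    completion = []
    for _ in range(max_words):
        word = next_word(words)
        if word == "<end>":
            break
        words.append(word)
        completion.append(word)
    return " ".join(completion)

print(generate("the cat"))  # prints "sat on the cat sat on the cat sat on"
```

A real LLM does the same thing at scale: each step conditions on everything generated so far, which is why the output stays coherent over long stretches of text.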


Added value: knowledge acquisition

Imagine learning a language by repeatedly reading all of Shakespeare's works. Over time, you wouldn't just be able to recall his plays and poems; you could also imitate his unique writing style.

Similarly, by feeding our LLMs with a multitude of textual sources, they've developed a comprehensive understanding of English and a broad base of general knowledge.


Interacting with Large Language Models

LLMs are queried in natural language; the craft of writing effective inputs is known as prompt engineering

Rather than writing lines of code and loading a model, you write a natural-language prompt and pass it to the model as input. For example:
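A minimal illustration of the prompt-in, completion-out pattern is shown below. Note that `complete` is a hypothetical helper, not the actual product SDK, and the hard-coded return value only illustrates the kind of text a model might produce; a real model's output is not fixed.

```python
prompt = "Write a tagline for an eco-friendly water bottle:"

def complete(prompt):
    """Hypothetical stand-in for a call to a hosted LLM.
    A real model would generate this text; it is hard-coded here
    purely for illustration."""
    return "Hydration that loves the planet as much as you do."

print(prompt)
print(complete(prompt))
```

The key point is that the "program" is the prompt itself: changing the instruction changes the behavior, with no code or model loading involved.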


Resource-intensive

Training and deploying large language models requires substantial data, computation, and engineering resources. Models offered as a service, such as our Jurassic-2 series, play an important role here, giving academic researchers and developers access to this technology.