When will a language model stop generating text?

When will a language model stop generating text? Let's find out!
Mentioned here: stop sequence, max tokens, min tokens

Language models are trained to generate text token by token and to produce a completion whose scope fits the prompt. Just as a model learns how to generate text, it also learns when to stop. Jurassic-2 tokenizers include a special token that signals where a reasonable generation should end by default. During pre-training, the models were trained on text items that ended with stop sequence suffixes, and so they learned to end sequences appropriately. However, there are times when you will benefit from controlling the scope of the generated text yourself. So how can you control the length of the sequence the models generate?

Set completion length parameters

You can set a lower and an upper bound on the number of tokens the model will generate (minTokens and maxTokens, respectively). Note that these limits are artificial: they are enforced regardless of what the text says, so a completion may be cut off in the middle of a sentence.
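To make the effect of the two bounds concrete, here is a minimal Python sketch of a decoding loop over a toy token stream. It does not call any real API: the `generate` function, the `EOS` marker, and the rule that an end signal is ignored until the lower bound is met are all assumptions made for illustration, and the actual service may behave differently.

```python
from typing import Iterable, List

EOS = "<eos>"  # hypothetical end-of-sequence marker for this toy example


def generate(token_stream: Iterable[str], min_tokens: int, max_tokens: int) -> List[str]:
    """Collect tokens from a toy stream, honoring lower and upper length bounds."""
    output: List[str] = []
    for token in token_stream:
        if len(output) >= max_tokens:
            break  # upper bound reached: stop, even in the middle of a sentence
        if token == EOS:
            if len(output) >= min_tokens:
                break  # a natural stop is allowed once the lower bound is met
            continue  # below the lower bound: ignore the end signal (assumed behavior)
        output.append(token)
    return output


# A toy stream in which the "model" tries to end after three tokens.
stream = ["The", " cat", " sat", EOS, " on", " the", " mat", ".", EOS]

print(generate(stream, min_tokens=0, max_tokens=10))  # ends at the first end signal
print(generate(stream, min_tokens=5, max_tokens=10))  # first end signal is ignored
print(generate(stream, min_tokens=0, max_tokens=2))   # cut off after two tokens
```

The last call shows the drawback mentioned above: with a small upper bound, the output ends wherever the limit falls, not where the sentence does.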

Set a stop sequence

You can set a stop sequence that tells the model where to stop. Once the model generates this sequence, generation ends and no further text is produced. A simple example: to generate a single sentence, you can set the character "." as a stop sequence, so the model stops as soon as it generates a period.
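The single-sentence case can be sketched in Python as a post-processing step on already generated text. The helper name `apply_stop_sequences` is hypothetical, and the sketch assumes the stop sequence itself is left out of the returned text; a real service applies the check during generation and may treat the matched sequence differently.

```python
from typing import Sequence


def apply_stop_sequences(text: str, stop_sequences: Sequence[str]) -> str:
    """Return the text cut at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # skip empty strings, which would match at position 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]  # assumption: the stop sequence itself is not returned


raw = "It was a sunny day. The birds were singing."
print(apply_stop_sequences(raw, ["."]))
```

With "." as the only stop sequence, everything from the first period onward is dropped, leaving a single sentence.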

This is very handy when using a few-shot prompt, that is, a prompt containing one or more completion examples preceding the actual text to complete. When generating the completion, the model may repeat the few-shot structure and generate additional examples unless instructed otherwise. It is therefore recommended to separate the examples with a custom sequence, such as “==”, and to set that sequence as a stop sequence. Once the completion is generated, the model repeats the pattern and generates the separator, which signals that text generation is complete. See the example below:
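As a rough illustration of this pattern in Python, the sketch below builds a few-shot prompt whose examples are separated by "==" and then trims a simulated completion at that separator. The sentiment task, the function names, and the raw completion string are all made up for the example; no real model or API is called.

```python
from typing import List, Tuple

SEPARATOR = "=="  # custom separator, also used as the stop sequence


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join labeled examples and the open-ended query with the separator."""
    blocks = [f"Review: {review}\nSentiment: {label}" for review, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return f"\n{SEPARATOR}\n".join(blocks)


def extract_answer(raw_completion: str) -> str:
    """Keep only the text generated before the separator appears."""
    return raw_completion.partition(SEPARATOR)[0].strip()


examples = [
    ("Loved every minute of it.", "Positive"),
    ("A complete waste of time.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "The staff were friendly and helpful.")

# Hypothetical raw output: without a stop sequence the model continues the pattern
# and starts inventing another example after the separator.
raw_completion = " Positive\n==\nReview: The food was cold.\nSentiment: Negative"

print(prompt)
print(extract_answer(raw_completion))
```

Because the separator closes every example in the prompt, the model is likely to emit it right after its answer, which gives the stop sequence a reliable place to trigger.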