Introduction to the core of our product
['▁I▁want▁to', '▁break', '▁free', '.']
The underscore characters (▁) stand in for whitespace in the tokens of the vocabulary and are stripped out when the response text is generated. Note how the first few words are merged into a single token as one semantic unit, while the last two words are not. Also note that the final punctuation mark is its own token, and that a whitespace mark appears at the start of the sentence because the tokenizer adds a dummy space to the beginning of each line for consistency.
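As a minimal sketch of how such a token list is produced, assuming a SentencePiece-based tokenizer and a hypothetical model file named tokenizer.model, the sentencepiece library can emit the raw pieces directly (the exact splits depend on the model's vocabulary):

import sentencepiece as spm

# Load a SentencePiece model (hypothetical path; any
# SentencePiece-based tokenizer behaves the same way).
sp = spm.SentencePieceProcessor(model_file="tokenizer.model")

# out_type=str returns the raw pieces, with ▁ marking word boundaries.
pieces = sp.encode("I want to break free.", out_type=str)
print(pieces)  # e.g. ['▁I▁want▁to', '▁break', '▁free', '.']

# Decoding strips the ▁ markers and restores normal whitespace.
print(sp.decode(pieces))  # 'I want to break free.'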
Models are trained with a special stop token (such as [eom]) that signals when a reasonable generation should end by default. During pre-training, the model saw text examples that ended with this stop token, and it thereby learned to end sequences appropriately.
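As a hedged sketch of this behavior using the Hugging Face transformers library (with gpt2 as a stand-in model), generation halts as soon as the model emits its stop token, even before the token budget is exhausted:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("I want to break free", return_tensors="pt")

# generate() stops early when the model emits its stop token
# (eos_token_id), even if max_new_tokens has not been reached.
output = model.generate(
    **inputs,
    max_new_tokens=50,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))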
You can provide additional guidance to the model about when to stop generating text:
The max_tokens parameter. (Good practice) Set max_tokens as a firewall against the rare cases where a model goes astray and generates a very long answer; with very high temperatures, the model is more likely to do so. Set this value well above your required maximum response length, purely as a precaution, as in the sketch below.
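For illustration, here is how max_tokens might be set in an OpenAI-style chat completion call; the model name is a stand-in, and other providers expose an equivalent parameter under a similar name:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# max_tokens acts as a hard ceiling: generation is cut off once the
# limit is reached, even if no stop token has appeared yet.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in model name
    messages=[{"role": "user", "content": "Explain tokenization briefly."}],
    max_tokens=300,  # set well above the expected answer length
)
print(response.choices[0].message.content)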