Jurassic 2.3.0 Release Announcement

Jurassic 2.3.0 focuses on improving grounded generation, making the model more reliable and less prone to hallucinations. The new version performs better when its output must be based on a specified context, such as an article, document or report. Common applications of grounded generation include contextual question answering, summarization, and description generation.

The new version is available for the Jurassic Ultra and Mid models.

Improved Capabilities

Grounded Question Answering - The new version is much better at answering a question based on provided information. According to our evaluations, Jurassic 2.3.0 produces 18% fewer factuality issues overall. In particular, the new version is 87% more likely to acknowledge that a question cannot be answered from the provided context, making it more honest and reliable. For example, when given the History of Target Corporation Wikipedia page as context and asked about initiatives taken in 2023, the model replies that it cannot provide an answer. The article covers events only up to 2022, so this is the correct response. Prior versions of Jurassic did not acknowledge this and instead hallucinated an answer.
Another example uses this HBR article on Generative AI as context. The following question is one that the article does not answer, and the model correctly says so.

PROMPT
Context:
{{Article's content}}

answer the following question based only on the context:
What are the specific challenges or obstacles faced by companies in implementing generative AI in HR and legal functions?

COMPLETION
From the presented context, I was unable to find an answer

The model is also more reliable when the question can be answered by using the context.

PROMPT
Context:
{{Article's content}}

answer the following question based only on the context:
According to the Gartner survey in the third quarter of 2023, what percentage of executive leaders reported their organization is either piloting or has already implemented generative AI solutions?

COMPLETION
According to the Gartner survey, 45% of executive leaders reported that their organization is piloting or experimenting with generative AI, and another 10% said they’ve already gone live with solutions.
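
The prompt layout used in the two examples above can be assembled programmatically. The following is a minimal sketch in Python that only builds the prompt string. The names PROMPT_TEMPLATE and build_grounded_prompt are hypothetical and are not part of any Jurassic SDK, and sending the prompt to a model is deliberately left out.

```python
# Hypothetical helper: reproduces the prompt layout shown in the examples.
# It only formats text; it does not call any model or API.
PROMPT_TEMPLATE = (
    "Context:\n"
    "{context}\n"
    "\n"
    "answer the following question based only on the context:\n"
    "{question}\n"
)


def build_grounded_prompt(context: str, question: str) -> str:
    """Return a grounded question-answering prompt for the given context."""
    # Strip surrounding whitespace so the layout stays consistent.
    return PROMPT_TEMPLATE.format(
        context=context.strip(),
        question=question.strip(),
    )


if __name__ == "__main__":
    # Placeholder context; in practice this would be the article's content.
    article = "Placeholder article text goes here."
    print(build_grounded_prompt(article, "What does the article say about X?"))
```

Keeping the instruction line identical across requests makes it easier to compare completions, including the cases where the model reports that the context does not contain an answer.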

Summarization - The new version shows significant improvements in generating factual summaries. On average, its summaries contain 33% fewer non-factual claims, making them much more reliable.

Grounded Description Generation - The model makes fewer factuality mistakes when generating a business, website or product description.