Summarize by Segment

Takes a document (raw text or URL), performs smart segmentation, and returns a summary and highlights for each segment.

AI21 Studio's Summarize by Segment model splits text into topical segments and provides a summary of each segment. The text can be either input directly, or you can specify a URL for the segmenter to retrieve. The model can parse text, HTML, and PDF documents. This feature is useful for enabling users to read the original text faster and more efficiently. They can skim where possible and pay more attention where needed. For the full API functionality see the API Reference.

Features

Segmentation

First, the model breaks the input text into segments. Rather than simply splitting on punctuation and newlines, it groups text that belongs to the same topic, so that each segment forms a coherent piece of text. This achieves two main goals:

  • Full coverage: the summary will not miss any key points in the document.
  • Easy integration: each summary is accompanied by the original text/html segment. This allows for seamless and quick integration into your product.

In summary

A summary is returned for each eligible segment (above 40 words). For shorter segments, the summary is empty (null). Summaries are faithful to the original segment and do not introduce new (and possibly false) information.

The input text should contain at least 40 words and no more than 50,000 characters. This translates to roughly 10,000 words, or an impressive 40 pages! For URLs, this limit applies only to the summarizable text (after parsing, preprocessing, etc.).
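The limits above can be checked on the client before sending a request. The following is a minimal sketch, not part of the API: the function name is hypothetical, and counting whitespace-separated tokens is only an approximation of how the service may count words.

```python
MIN_WORDS = 40       # lower bound stated above
MAX_CHARS = 50_000   # upper bound stated above


def check_input_length(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text is within limits."""
    problems = []
    # Whitespace-separated tokens: an approximation of the service's word count.
    word_count = len(text.split())
    if word_count < MIN_WORDS:
        problems.append(f"too short: {word_count} words, need at least {MIN_WORDS}")
    if len(text) > MAX_CHARS:
        problems.append(f"too long: {len(text)} characters, limit is {MAX_CHARS}")
    return problems
```

Returning a list rather than raising lets the caller report every problem at once, for example in a form validation message.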

Give me the highlights

In addition to summaries, for each eligible segment the model provides highlights, which are extractive key points from the segment. These highlights are the basis for the generated summary, which keeps the summary faithful to the original text.
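Because highlights are extractive, each one should correspond to a span of the segment text. The sketch below checks this using the field names that appear in the response example in the Examples section (segmentText, highlights, text, startIndex, endIndex). Treating the indices as character offsets with an exclusive end is an inference from that example, not a documented guarantee, and the function name and sample segment are hypothetical.

```python
def highlight_spans(segment: dict) -> list[tuple[str, bool]]:
    """Pair each highlight with whether it matches its slice of segmentText."""
    text = segment["segmentText"]
    results = []
    for highlight in segment.get("highlights", []):
        # Assumption: startIndex is inclusive and endIndex is exclusive.
        sliced = text[highlight["startIndex"]:highlight["endIndex"]]
        results.append((highlight["text"], sliced == highlight["text"]))
    return results


# Hypothetical, shortened segment used only for illustration.
example_segment = {
    "segmentText": "Cats sleep a lot. Dogs bark.",
    "highlights": [{"text": "Dogs bark.", "startIndex": 18, "endIndex": 28}],
}
print(highlight_spans(example_segment))
```

Offsets like these can be used to underline or color the highlighted spans when rendering the original text.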

Different types

In addition to working with free text, this API can also work directly with your favorite (or least favorite) webpage URLs! No need to spend time and effort scraping text yourself - just input the required URL and let the summarization begin.

Note: if the webpage you are trying to summarize is behind a paywall or has restricted access, the call will fail and return an error.
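If your application accepts both pasted text and links, the request body can be chosen from the shape of the input. This is a rough sketch: the helper name is hypothetical, the "URL" value comes from the example in this document, and "TEXT" is an assumed value for raw text that should be confirmed in the API Reference.

```python
from urllib.parse import urlparse


def build_payload(source: str) -> dict:
    """Build a request body, choosing sourceType from the shape of the input."""
    parsed = urlparse(source.strip())
    # Treat the input as a URL only if it has an http(s) scheme and a host.
    is_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    # "URL" appears in this document's example; "TEXT" is an assumed value.
    return {"sourceType": "URL" if is_url else "TEXT", "source": source}
```

The resulting dictionary can be serialized with json.dumps and sent as the request body.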

Summaries with laser focus

Save your users time by providing focused summaries on a topic of their choice with our enhanced guided summary feature. By specifying keywords, phrases, or topics, the API will produce only the relevant summaries, allowing your users faster browsing and better text consumption.
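A guided request might be built as sketched below. The parameter name "focus" is an assumption made for illustration, as is the helper name; check the API Reference for the actual parameter and any length limits before relying on it.

```python
import json


def build_guided_payload(source: str, source_type: str, focus: str) -> str:
    """Serialize a request body that asks for summaries guided by a topic."""
    body = {
        "sourceType": source_type,
        "source": source,
        # Assumed parameter name; confirm it in the API Reference.
        "focus": focus,
    }
    return json.dumps(body)
```

For example, build_guided_payload(page_url, "URL", "publishing") would request summaries guided by the word "publishing", where page_url is a placeholder for the address of the document to summarize.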

Examples

With just a few lines of code, you can provide your users with high-quality summaries and highlights, whether they are students who need to perform a massive literature review, journalists who need to sift through large amounts of information, or your average Joe who just needs to get the gist of a long report. The following examples illustrate both use cases and special features you can take advantage of with this API.

Summarization of Articles

There are more blog posts than there are people in the world (at least as of 2023). Reading them all would be impossible. However, with the short, on-point segment summaries this model provides, your users will have a fighting chance.

You can use the following code in order to summarize this blog post:

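import requests
import json

# Endpoint for the Summarize by Segment API.
url = "https://api.ai21.com/studio/v1/summarize-by-segment"

# Summarize a web page: sourceType is "URL" and source is the page address.
payload = json.dumps({
  "sourceType": "URL",
  "source": "https://www.ai21.com/blog/ai21-bigquery"
})
# Replace <token> with your API token.
headers = {
  'Authorization': 'bearer <token>',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

# Print the raw JSON response body.
print(response.text)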

The response is a JSON object which includes the segments from the original text, as well as summary and highlights per segment:

{
    "id": "041c3c47-0cf2-0c64-c1a7-27dc804629ed",
    "segments": [
        {
            "summary": "Today, we announced our collaboration with Google Cloud, offering our state-of-the-art generative AI capabilities on top of BigQuery. Customers will be able to perform complex language tasks on their organizational data natively within BigQuery.",
            "segmentText": "Today marks an exciting milestone as we announce our collaboration with Google Cloud, offering our state-of-the-art generative AI capabilities on top of BigQuery. \n\nOpening its annual Google Next conference, Google Cloud announced the integration of our Contextual Answers language model with its flagship BigQuery product. We are proud to be one of the first partners to integrate generative AI features on top of BigQuery, Google Cloud’s fully-managed cloud data warehouse with built-in ML.\n\nThe integration will enable customers to perform complex language tasks on their organizational data natively within BigQuery. For example, retail customers can easily extract quantitative insights from product reviews, analyze attributes of successful product descriptions, and identify areas of improvement within support tickets. Financial services companies can use the integration to quickly analyze reports.",
            "segmentHtml": "<p>Today marks an exciting milestone as we announce our collaboration with Google Cloud, offering our state-of-the-art generative AI capabilities on top of BigQuery. </p><span>&nbsp;&nbsp;</span><p>Opening its annual Google Next conference, Google Cloud announced the integration of our <a href=\"https://www.ai21.com/blog/introducing-contextual-answers\">Contextual Answers</a> language model with its flagship BigQuery product. We are proud to be <strong>one of the first partners to integrate generative AI features on top of BigQuery</strong>, Google Cloud&#8217;s fully-managed cloud data warehouse with built-in ML.</p><span>&nbsp;&nbsp;</span><p>The integration will enable customers to perform complex language tasks on their organizational data natively within BigQuery. For example, retail customers can easily extract quantitative insights from product reviews, analyze attributes of successful product descriptions, and identify areas of improvement within support tickets. Financial services companies can use the integration to quickly analyze reports.</p>",
            "segmentType": "normal_text",
            "hasSummary": true,
            "highlights": [
                {
                    "text": "Today marks an exciting milestone as we announce our collaboration with Google Cloud, offering our state-of-the-art generative AI capabilities on top of BigQuery.",
                    "startIndex": 0,
                    "endIndex": 162
                },
                {
                    "text": "The integration will enable customers to perform complex language tasks on their organizational data natively within BigQuery.",
                    "startIndex": 494,
                    "endIndex": 620
                }
            ]
        },
        {
            "summary": null,
            "segmentText": "The integration continues our long-standing partnership with Google Cloud. As users of its custom AI accelerators and Tensor Processor Units (TPUs), we’ve benefited from Google Cloud’s purpose-built AI infrastructure and expertise to train and serve our advanced LLMs.",
            "segmentHtml": "<p>The integration continues our long-standing partnership with Google Cloud. As users of its custom AI accelerators and Tensor Processor Units (TPUs), we&#8217;ve benefited from Google Cloud&#8217;s purpose-built AI infrastructure and expertise to train and serve our advanced LLMs.<br/></p>",
            "segmentType": "normal_text_short",
            "hasSummary": false,
            "highlights": []
        },
        {
            "summary": "Google Cloud’s vast portfolio of impressive AI hardware is why AI21 Labs chose to work with the company, and today’s news marks an important step forward.",
            "segmentText": "“Google Cloud’s vast portfolio of impressive AI hardware is why we chose to work with the company, and we’re proud to now double down on expanding our generative AI solutions alongside them,” said Ori Goshen, Co-CEO and co-founder, AI21 Labs. “Our partnership with Google Cloud has had a profound effect on our work to advance the possibilities of artificial intelligence and natural language processing, and today’s news of our deepened collaboration marks an important step forward in that journey.”",
            "segmentHtml": "<p>&#8220;Google Cloud&#8217;s vast portfolio of impressive AI hardware is why we chose to work with the company, and we&#8217;re proud to now double down on expanding our generative AI solutions alongside them,&#8221; said Ori Goshen, Co-CEO and co-founder, AI21 Labs. &#8220;Our partnership with Google Cloud has had a profound effect on our work to advance the possibilities of artificial intelligence and natural language processing, and today&#8217;s news of our deepened collaboration marks an important step forward in that journey.&#8221;</p>",
            "segmentType": "normal_text",
            "hasSummary": true,
            "highlights": [
                {
                    "text": "“Google Cloud’s vast portfolio of impressive AI hardware is why we chose to work with the company",
                    "startIndex": 0,
                    "endIndex": 97
                },
                {
                    "text": "AI21 Labs.",
                    "startIndex": 232,
                    "endIndex": 242
                },
                {
                    "text": "and today’s news of our deepened collaboration marks an important step forward in that journey.”",
                    "startIndex": 405,
                    "endIndex": 501
                }
            ]
        },
        {
            "summary": "AI21 Labs is taking advantage of Google Cloud’s leading infrastructure to bring generative AI to businesses in every industry.",
            "segmentText": "“AI21 Labs is a leading generative AI startup that is taking advantage of the incredible performance that Google Cloud’s leading infrastructure offers,” said Thomas Kurian, CEO, Google Cloud. “Our new BigQuery integrations are a great example of how we are working together to bring the value of generative AI to businesses in every industry.”",
            "segmentHtml": "<p>&#8220;AI21 Labs is a leading generative AI startup that is taking advantage of the incredible performance that Google Cloud&#8217;s leading infrastructure offers,&#8221; said Thomas Kurian, CEO, Google Cloud. &#8220;Our new BigQuery integrations are a great example of how we are working together to bring the value of generative AI to businesses in every industry.&#8221;</p>",
            "segmentType": "normal_text",
            "hasSummary": true,
            "highlights": [
                {
                    "text": "“AI21 Labs is a leading generative AI startup that is taking advantage of the incredible performance that Google Cloud’s leading infrastructure offers",
                    "startIndex": 0,
                    "endIndex": 150
                },
                {
                    "text": "“Our new BigQuery integrations are a great example of how we are working together to bring the value of generative AI to businesses in every industry.”",
                    "startIndex": 192,
                    "endIndex": 343
                }
            ]
        },
        {
            "summary": "By leveraging generative AI within BigQuery, businesses can conduct quantitative analyses of their unstructured data at scale using natural language, helping them make better-informed decisions.",
            "segmentText": "By unlocking the power of generative AI within BigQuery, businesses can conduct quantitative analyses of their organization’s unstructured data at scale using natural language. The Contextual Answers model enables customers to easily gain deep insights that are otherwise inaccessible or difficult to extract, helping them make better-informed decisions. We can’t wait to see how customers leverage generative AI capabilities within their BigQuery environment.\n\nLearn more about Contextual Answers here.",
            "segmentHtml": "<p>By unlocking the power of generative AI within BigQuery, businesses can conduct quantitative analyses of their organization&#8217;s unstructured data at scale using natural language. The Contextual Answers model enables customers to easily gain deep insights that are otherwise inaccessible or difficult to extract, helping them make better-informed decisions. We can&#8217;t wait to see how customers leverage generative AI capabilities within their BigQuery environment.</p><span>&nbsp;&nbsp;</span><p><em>Learn more about Contextual Answers </em><a href=\"https://www.ai21.com/blog/introducing-contextual-answers\">here</a>.</p>",
            "segmentType": "normal_text",
            "hasSummary": true,
            "highlights": [
                {
                    "text": "By unlocking the power of generative AI within BigQuery, businesses can conduct quantitative analyses of their organization’s unstructured data at scale using natural language.",
                    "startIndex": 0,
                    "endIndex": 176
                },
                {
                    "text": "helping them make better-informed decisions.",
                    "startIndex": 310,
                    "endIndex": 354
                }
            ]
        },
        {
            "summary": null,
            "segmentText": "\u200dLearn more about Google Cloud’s BigQuery here.",
            "segmentHtml": "<p>&#8205;<em>Learn more about Google Cloud&#8217;s BigQuery </em><a href=\"https://cloud.google.com/bigquery#section-1\">here</a>.</p>",
            "segmentType": "other",
            "hasSummary": false,
            "highlights": []
        }
    ]
}
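Once the response is parsed (for example with response.json()), a reading view can be assembled from the segments. The sketch below uses the field names shown in the response above; the function name, the "[no summary]" marker, the 60-character excerpt, and the sample data are illustrative choices, not part of the API.

```python
def digest(response_body: dict) -> list[str]:
    """Return one line per segment: its summary, or a short excerpt if none."""
    lines = []
    for segment in response_body.get("segments", []):
        if segment.get("hasSummary") and segment.get("summary"):
            lines.append(segment["summary"])
        else:
            # Short segments come back with summary set to null (None in Python).
            lines.append("[no summary] " + segment["segmentText"][:60])
    return lines


# Hypothetical, shortened response used only for illustration.
example_response = {
    "segments": [
        {"summary": "A short summary.", "segmentText": "Long text.", "hasSummary": True},
        {"summary": None, "segmentText": "Brief note.", "hasSummary": False},
    ]
}
print("\n".join(digest(example_response)))
```

Falling back to the original text for segments without a summary keeps the digest complete, so readers do not silently lose short passages.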

Guided summary - a summary tailored to your interests

Have you ever read an entire report only to find that the one section you care about is hidden in the middle or, worse, scattered throughout in small pieces? Wouldn’t it be nice if your users didn’t have to spend all that time searching, and instead got the gist of exactly what they need? With our guided summary feature, users can simply enter a keyword and obtain a summary customized to their preferences. The result is not a subset of the original summary, but a different, more keyword-focused summary.

As an example, let's take a look at Amazon's 2022 Letter to Shareholders. Using /summarize-by-segment, we will produce a summary guided by the word "publishing". The results are listed below, with each segment's text followed by its summary:

Segment Text:
In the 25 years I’ve been at Amazon, there has been constant change, much of which we’ve initiated ourselves.

When I joined Amazon in 1997, we had booked $15M in revenue in 1996, were a books-only retailer, did not have a third-party marketplace, and only shipped to addresses in the US.

Today, Amazon sells nearly every physical and digital retail item you can imagine, with a vibrant third-party seller ecosystem that accounts for 60% of our unit sales, and reaches customers in virtually every country around the world.

Similarly, building a business around a set of technology infrastructure services in the cloud was not obvious in 2003 when we started pursuing AWS, and still wasn’t when we launched our first services in 2006.

Having virtually every book at your fingertips in 60 seconds, and then being able to store and retrieve them on a lightweight digital reader was not “a thing” yet when we launched Kindle in 2007, nor was a voice-driven personal assistant like Alexa (launched in 2014) that you could use to access entertainment, control your smart home, shop, and retrieve all sorts of information.

Summary:
In the 25 years I’ve been at Amazon, there has been constant change, much of which we’ve initiated ourselves. When I joined Amazon in 1997, we were a books-only retailer, did not have a third-party marketplace, and only shipped to addresses in the US.

Segment Text:
This process has led to some expansions that seem straightforward, and others that some folks might not have initially guessed.

The earliest example is when we chose to expand from just selling Books, to adding categories like Music, Video, Electronics, and Toys.

Back then (1998-1999), it wasn’t universally applauded, but in retrospect, it seems fairly obvious.

The same could be said for our international Stores expansion.

In 2022, our international consumer segment drove $118B of revenue.

In our larger, established international consumer businesses, we’re big enough to be impacted by the slowing macroeconomic conditions; however, the growth in 2019-2021 on a large base was remarkable—30% compound annual growth rate (“CAGR”) in the UK, 26% in Germany, and 21% in Japan (excluding the impact of FX).

Summary:
In 1998-1999, we expanded from just selling Books to adding categories like Music, Video, Electronics, and Toys. In 2022, our international consumer segment will drive $118B of revenue.

Segment Text:
Obsess Over Customers

From the beginning, our focus has been on offering our customers compelling value.

We realized that the Web was, and still is, the World Wide Wait.

Therefore, we set out to offer customers something they simply could not get any other way, and began serving them with books.

We brought them much more selection than was possible in a physical store (our store would now occupy 6 football fields), and presented it in a useful, easy-to-search, and easy-to-browse format in a store open 365 days a year, 24 hours a day.

Summary:
We set out to offer customers something they simply could not get any other way, and began serving them with books.

Segment Text:
Infrastructure

During 1997, we worked hard to expand our business infrastructure to support these greatly increased traffic, sales, and service levels:

  • Amazon.com’s employee base grew from 158 to 614, and we significantly strengthened our management team.
  • Distribution center capacity grew from 50,000 to 285,000 square feet, including a 70% expansion of our Seattle facilities and the launch of our second distribution center in Delaware in November.
  • Inventories rose to over 200,000 titles at year-end, enabling us to improve availability for our customers.
  • Our cash and investment balances at year-end were $125 million, thanks to our initial public offering in May 1997 and our $75 million loan, affording us substantial strategic flexibility.

Summary:
In 1997, Amazon.com expanded its business infrastructure to support increased traffic, sales, and service levels, including a 70% expansion of our Seattle facilities and the launch of our second distribution center in Delaware.