Text Segmentation

This model will no longer be supported as of November 14, 2024. Please try Jamba 1.5 instead.

Use the Text Segmentation endpoint to split text into coherent, readable units based on distinct topics and lines. This makes it easy to break long text into manageable chunks. The endpoint accepts both raw text and webpage URLs as input.

Simple API request

source: The raw input text, or the URL of a web page.
sourceType: The type of the source, either TEXT or URL.

{
   "source": "https://www.ai21.com/blog/summarizing-legal-documents-for-different-personas-using-ai21-studio",
   "sourceType": "URL"
}
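
As a minimal sketch, the request body above could be built and sent from Python roughly as follows. The endpoint URL, the bearer-token header, and the helper names `build_segmentation_request` and `segment` are assumptions made for illustration and are not stated on this page, so verify them against your own API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint URL; this page does not state it, so verify before use.
SEGMENTATION_URL = "https://api.ai21.com/studio/v1/segmentation"

# The two source types documented above.
VALID_SOURCE_TYPES = {"TEXT", "URL"}


def build_segmentation_request(source: str, source_type: str) -> dict:
    """Return the JSON body described above: `source` plus `sourceType`."""
    if source_type not in VALID_SOURCE_TYPES:
        raise ValueError(f"sourceType must be TEXT or URL, got {source_type!r}")
    if not source:
        raise ValueError("source must be a non-empty string")
    return {"source": source, "sourceType": source_type}


def segment(source: str, source_type: str, api_key: str) -> dict:
    """POST the body and return the parsed JSON response (makes a network call)."""
    body = json.dumps(build_segmentation_request(source, source_type)).encode("utf-8")
    request = urllib.request.Request(
        SEGMENTATION_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Validating `sourceType` locally keeps an obviously malformed request from being sent at all. Only `segment` touches the network, so the body builder can be checked offline.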

The following is the expected response for the example request above:

{
  "id": "3f4d8be1-3a7f-9d67-c395-66188f72e6c5",
  "segments": [
    {
      "segmentText": "Create your own custom summarizer using AI21 Studio's large language models. This tutorial demonstrates the process on legal documents.",
      "segmentType": "other"
    },
    {
      "segmentText": "We’ve all experienced reading long, tedious, and boring pieces of text - financial reports, legal documents, or terms and conditions (though, who actually reads those terms and conditions to be honest?).\n\nImagine a company that employs hundreds of thousands of employees. In today's information overload age, nearly 30% of the workday is spent dealing with documents. There's no surprise here, given that some of these documents are long and convoluted on purpose (did you know that reading through all your privacy policies would take almost a quarter of a year?). Aside from inefficiency, workers may simply refrain from reading some documents (for example, Only 16% of Employees Read Their Employment Contracts Entirely Before Signing!). \n\nThis is where AI-driven summarization tools can be helpful: instead of reading entire documents, which is tedious and time-consuming, users can (ideally) quickly extract relevant information from a text. With large language models, the development of those tools is easier than ever, and you can offer your users a summary that is specifically tailored to their preferences.",
      "segmentType": "normal_text"
    },
    {
      "segmentText": "Let's take legal documents, for example. Though they are written in English, many people find legal documents to be difficult to comprehend, as if they were actually written in a foreign language. Moreover, the interesting parts of each document may differ depending on the person who reads it, so off-the-shelf summarization tools may be too general or too specific. As an example, let's look at the involved personas:",
      "segmentType": "normal_text"
    },
    {
      "segmentText": "Lawyers. They are interested in several key points, but legal language and terms are especially relevant to them.Your average Joe. Doesn’t understand all the legal terms, and really wants to get the gist in simple words.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "",
      "segmentType": "other"
    },
    {
      "segmentText": "Using the IRS as our example, we will demonstrate here how to build a simple summarizer for those 2 personas, and discuss which future directions one should consider.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "Working with large language models",
      "segmentType": "h2"
    },
    {
      "segmentText": "Large language models naturally follow patterns in input (prompt), and provide coherent completion that follows the same patterns. For that, we want to feed them with several examples in the input (\"few-shot prompt\"), so they can follow through. The process of creating the correct prompt for your problem is called prompt engineering, and you can read more about it here.",
      "segmentType": "normal_text"
    },
    {
      "segmentText": "Collecting data",
      "segmentType": "h2"
    },
    {
      "segmentText": "Ideally, you should have a variety of examples to draw from. It is important that those examples represent the actual documents that your summarizer should work on. This means they should be written in the same way as real world documents, and they should be as varied as possible.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "Don't have it? No need to worry - large language models generalize exceptionally well, so the results will still be good. In the end, optimal results can always be achieved with proper data - but you can revisit this at a later date.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "Summarize for lawyers",
      "segmentType": "h2"
    },
    {
      "segmentText": "If you are summarizing legal documents for lawyers, you should definitely maintain all legal terms from the original document. In this demonstration, we will work on the opening paragraphs of legal letters.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "Here is an example of a few-shot prompt:",
      "segmentType": "other"
    },
    {
      "segmentText": "\nYou will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary maintains formal terms.\nLetter:\nDear [Name],\nThis letter responds to your authorized representatives’ letter of May 19, 2022, and subsequent correspondence, requesting rulings regarding the income, estate, gift, and generation-skipping transfer (GST) tax consequences of the proposed division of Trust 1.\nThe facts and representations submitted are as follows:\nGrandparent died on Date 1, leaving the residue of Grandparent’s probate estate in equal shares to Trust 1 and Trust 2, testamentary trusts established under Will. Trust 1 is held for the benefit of Grandchild 1 and Grandchild 1’s descendants, and Trust 2 is held for the benefit of Grandchild 2 and Grandchild 2’s descendants. Grandchild 1 has five children, Great-Grandchild 1, Great-Grandchild 2, Great-Grandchild 3, GreatGrandchild 4, and Great-Grandchild 5. Grandchild 2 has one child, Great-Grandchild 6. Neither Grandchild 1 nor Grandchild 2 has any deceased children. Trustees serve as trustees of Trust 1. 
Trust 1 is the subject of this ruling request.\nSection 9(a) of Trust 1 provides that Trustees may distribute the net income of Trust 1 to or for the benefit of Grandchild 1 and Grandchild 1’s descendants in such proportions and at such times as Trustees determine is desirable or necessary, considering their needs, best interests, and other sources of income, or may annually accumulate and add all or part of the net income to the principal of Trust 1.\nSection 9(b) of Trust 1 provides that Trustees may distribute the principal of Trust 1 to or for the benefit of Grandchild 1 and Grandchild 1’s descendants in such proportions PLR-110419-22 3 and at such times as Trustees determine is desirable or necessary for their medical care, comfortable maintenance, education, or general support and welfare, considering their other resources.\nSection 9(c) of Trust 1 provides that Trust 1 will terminate 21 years after the death of the survivor of Grandchild 1 and Grandchild 2, and the principal of Trust 1 will be distributed to Grandchild 1’s descendants, per stirpes. If, however, Grandchild 1 and Grandchild 1’s descendants all die before that date, Trust 1 will terminate early, and the principal of Trust 1 will be distributed to Trust 2.\nSummary:\n- Grandparent died and left the residue of his estate to Trust 1 and Trust 2, testamentary trusts established under Will. 
The trustees of Trust 1 want to divide the estate equally between Grandchild 1 and Grandchild 2.\n- Section 9(a) of Trust 1 provides that Trustees may distribute the net income to Grandchild 1 and Grandchild 1's descendants in any proportion they determine is desirable.\n- Section 9(b) of Trust 1 provides that Trustees may distribute principal to Grandchild 1 and Grandchild 1's descendants as they determine is desirable.\n- Section 9(c) of Trust 1 provides that if Grandchild 1 and Grandchild 1's descendants all die before 21 years, Trust 1 will terminate early.\n##\nYou will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary maintains formal terms.\nLetter:\nDear [Name],\nThis letter ruling is in response to a request from your authorized representative dated March 11, 2022, and subsequent documentation requesting an extension of an additional five years under Internal Revenue Code (“IRC”) section 4943(c)(7) for disposing of certain excess business holdings. Taxpayer represents the facts as follows.\nFACTS\nTaxpayer was incorporated as a State nonprofit corporation. Taxpayer is exempt from federal income tax under IRC section 501(a) as an organization described in IRC section 501(c)(3) and is classified as a private foundation under IRC section 509(a). Taxpayer was created by Individuals to support the domestic and international Community and various charitable organizations. Individuals were substantial contributors to Taxpayer within the meaning of IRC section 507(d)(2), and therefore disqualified persons with respect to Taxpayer under IRC section 4946(a)(1)(A).\nAs a result of Individuals’ death, Taxpayer received A shares of Entity 1 common voting stock from Trust, a disqualified person, on Date 1, which Taxpayer represents is an unusually large testamentary gift or bequest. 
Additionally, on Date 2, Granddaughter, who is the granddaughter of one of the Individuals, and who serves as a co-trustee of Taxpayer, received a testamentary disposition from Trust of B shares of Entity 1 common voting stock. Granddaughter is also a disqualified person under IRC section 4946. The combined shares of Taxpayer and Granddaughter represent approximately C percent of Entity 1’s outstanding capital stock and are a minority interest in Entity 1. As a result of the testamentary gift or bequest of the A shares, Taxpayer has excess business holdings of Entity 1 under IRC section 4943(c)(1).\nOn Date 3, also as a result of Individuals’ death, Taxpayer received from Trust an approximate Z percent membership interest in Entity 2, an LLC, which taxpayer represents is an unusually large testamentary gift or bequest. As a result of the testamentary gift or bequest of the LLC membership interest, which is a minority interest, Taxpayer has excess business holdings of Entity 2 under IRC section 4943(c)(1).\nSummary:\n- A taxpayer requested an extension of five years under IRC section 4943(c)(7) for disposing of certain excess business holdings.\n- A testamentary gift of stock to Taxpayer and a testamentary disposition to Granddaughter, both disqualified persons, resulted in Taxpayer having excess business holdings of Entity 1 under IRC section 4943(c)(1).\n- On Date 3, Taxpayer received an unusually large testamentary gift or bequest of an LLC membership interest, which is an excess business holding.\n##\nYou will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary maintains formal terms.\nLetter:\nDear [Name],\nThis letter responds to a letter dated March 15, 2022, submitted on behalf of S Corporation Target, Purchaser, and Shareholder (collectively, the \"Parties\"), requesting an extension of time under §301.9100-3 of the Procedure and Administration Regulations to file an election. 
The Parties are requesting an extension of time to file the election statement under §1.336-2(h)(3)(iii) of the Income Tax Regulations (the \"Election Statement\") with respect to Purchaser's acquisition of all the stock of S Corporation Target from Shareholder on Date 1. The material information submitted is summarized below.\nOn Date 1, Purchaser acquired all the stock of S Corporation Target from Shareholder (the \"Stock Disposition\"). It has been represented that the Stock Disposition qualified as a \"qualified stock disposition\" as defined in §1.336-1(b)(6).\nThe Parties intended to make a section 336(e) election for the Stock Disposition but, for various reasons, a timely election was not fully made. Subsequently, this request was submitted, under §301.9100-3, for an extension of time to file the Election Statement. The Parties each represented that they are not seeking to alter a return position for which an accuracy-related penalty has been or could be imposed under section 6662.\nRegulations promulgated under section 336(e) permit certain sales, exchanges, or distributions of stock of a corporation to be treated as asset dispositions if: (1) the disposition is a \"qualified stock disposition\" as defined in §1.336-1(b)(6); and (2) a section 336(e) election is made.\nSummary:\n",
      "segmentType": "other"
    },
    {
      "segmentText": "And the completion:",
      "segmentType": "h3"
    },
    {
      "segmentText": "- The Parties requested an extension of time under §301.9100-3 of the Procedure and Administration Regulations to file an election.- The Parties are requesting an extension of time to file the Election Statement with respect to Purchaser's acquisition of all the stock of S Corporation Target from Shareholder on Date 1.- The Parties represented that they are not seeking to alter a return position for which an accuracy-related penalty has been or could be imposed under section 6662.##",
      "segmentType": "other"
    },
    {
      "segmentText": "Notes",
      "segmentType": "h3"
    },
    {
      "segmentText": "It is wise to try several phrasings for your prompt. You can, for instance, keep it simple and direct by writing:",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "\nSummarize the following part of a legal letter while maintaining formal terms.\nLetter:\n[LETTER]\nSummary:\n  ",
      "segmentType": "other"
    },
    {
      "segmentText": "Alternatively, you can take a more elaborate approach, such as:",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "\nYou are LegalAI, an AI legal assistant that excels at summarizing legal documents.\nBelow you will find part of a legal letter. Read it and summarize it. Make sure the summary maintains formal terms.\nLetter:\n[LETTER]\nSummary:\n  ",
      "segmentType": "other"
    },
    {
      "segmentText": "As you can see, we are providing the model with several examples, separated by a stop sequence, which is easily spotted while reading the text. You can read more about stop sequences here.Because this task requires high accuracy, we recommend working at a low temperature. It's best to explore the range of 0.0-0.3, since temperature 0 does tend to produce short summaries. You can read more about temperature here.You can test several options for the prompt in our playground.There is a consistent structure to all the examples in the prompt. This few-shot prompt can be created using the following code (assuming you have the paragraphs and summaries):",
      "segmentType": "normal_text"
    },
    {
      "segmentText": "def make_single_example(letter, summary):\n    example = \"You will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary maintains formal terms.\\n\"\n    example += \"Letter:\\n\"\n    example += letter\n    example += \"\\n\"\n    example += \"Summary:\\n\"\n    example += summary\n    \n    return example\n\n# This is the stop sequence\nSEPARATOR = \"\\n##\\n\"\n\nFEW_SHOT_PREFIX = SEPARATOR.join(\n    make_single_example(letter, summary) for letter, summary in zip(letters, summaries)\n)\n\ndef make_few_shot_prompt(letter):\n    \n    return FEW_SHOT_PREFIX + SEPARATOR + make_single_example(letter, '') # keep the summary empty and let the model complete\n",
      "segmentType": "other"
    },
    {
      "segmentText": "Summarize for the average Joe",
      "segmentType": "h2"
    },
    {
      "segmentText": "Despite the fact that there are over 1.3 million lawyers in the United States alone, most people  (still) aren't lawyers and have trouble understanding legal documents. In this case, we probably want the summary to be written in a simple, easy-to-understand manner.\n\nAs this task involves both summarizing and simplifying texts, it is inherently more difficult. Thus, we will probably need more \"shots\" in our few-shot prompt (in this case, we added another example to the prompt).",
      "segmentType": "normal_text"
    },
    {
      "segmentText": "Here is an example of a few-shot prompt:",
      "segmentType": "other"
    },
    {
      "segmentText": "\nYou will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary is short and written in simple words.\nLetter:\nDear [Name],\nThis letter responds to your authorized representatives’ letter of May 19, 2022, and subsequent correspondence, requesting rulings regarding the income, estate, gift, and generation-skipping transfer (GST) tax consequences of the proposed division of Trust 1.\nThe facts and representations submitted are as follows:\nGrandparent died on Date 1, leaving the residue of Grandparent’s probate estate in equal shares to Trust 1 and Trust 2, testamentary trusts established under Will. Trust 1 is held for the benefit of Grandchild 1 and Grandchild 1’s descendants, and Trust 2 is held for the benefit of Grandchild 2 and Grandchild 2’s descendants. Grandchild 1 has five children, Great-Grandchild 1, Great-Grandchild 2, Great-Grandchild 3, GreatGrandchild 4, and Great-Grandchild 5. Grandchild 2 has one child, Great-Grandchild 6. Neither Grandchild 1 nor Grandchild 2 has any deceased children. Trustees serve as trustees of Trust 1. 
Trust 1 is the subject of this ruling request.\nSection 9(a) of Trust 1 provides that Trustees may distribute the net income of Trust 1 to or for the benefit of Grandchild 1 and Grandchild 1’s descendants in such proportions and at such times as Trustees determine is desirable or necessary, considering their needs, best interests, and other sources of income, or may annually accumulate and add all or part of the net income to the principal of Trust 1.\nSection 9(b) of Trust 1 provides that Trustees may distribute the principal of Trust 1 to or for the benefit of Grandchild 1 and Grandchild 1’s descendants in such proportions PLR-110419-22 3 and at such times as Trustees determine is desirable or necessary for their medical care, comfortable maintenance, education, or general support and welfare, considering their other resources.\nSection 9(c) of Trust 1 provides that Trust 1 will terminate 21 years after the death of the survivor of Grandchild 1 and Grandchild 2, and the principal of Trust 1 will be distributed to Grandchild 1’s descendants, per stirpes. If, however, Grandchild 1 and Grandchild 1’s descendants all die before that date, Trust 1 will terminate early, and the principal of Trust 1 will be distributed to Trust 2.\nSummary:\nThe purpose of this letter is to describe a trust set up by a grandparent for the benefit of their grandchildren and great-grandchildren. The trustees may distribute money from the trust, but they must follow certain rules. After 21 years, the trust will end, and the remaining money will be distributed to the main beneficiary's descendants. If all the descendants of the main beneficiary pass away before then, the remaining money will be transferred to another trust.\n##\nYou will now be presented with a part of a legal letter. Read it and summarize it. 
Make sure the summary is short and written in simple words.\nLetter:\nDear [Name],\nThis letter ruling is in response to a request from your authorized representative dated March 11, 2022, and subsequent documentation requesting an extension of an additional five years under Internal Revenue Code (“IRC”) section 4943(c)(7) for disposing of certain excess business holdings. Taxpayer represents the facts as follows.\nFACTS\nTaxpayer was incorporated as a State nonprofit corporation. Taxpayer is exempt from federal income tax under IRC section 501(a) as an organization described in IRC section 501(c)(3) and is classified as a private foundation under IRC section 509(a). Taxpayer was created by Individuals to support the domestic and international Community and various charitable organizations. Individuals were substantial contributors to Taxpayer within the meaning of IRC section 507(d)(2), and therefore disqualified persons with respect to Taxpayer under IRC section 4946(a)(1)(A).\nAs a result of Individuals’ death, Taxpayer received A shares of Entity 1 common voting stock from Trust, a disqualified person, on Date 1, which Taxpayer represents is an unusually large testamentary gift or bequest. Additionally, on Date 2, Granddaughter, who is the granddaughter of one of the Individuals, and who serves as a co-trustee of Taxpayer, received a testamentary disposition from Trust of B shares of Entity 1 common voting stock. Granddaughter is also a disqualified person under IRC section 4946. The combined shares of Taxpayer and Granddaughter represent approximately C percent of Entity 1’s outstanding capital stock and are a minority interest in Entity 1. 
As a result of the testamentary gift or bequest of the A shares, Taxpayer has excess business holdings of Entity 1 under IRC section 4943(c)(1).\nOn Date 3, also as a result of Individuals’ death, Taxpayer received from Trust an approximate Z percent membership interest in Entity 2, an LLC, which taxpayer represents is an unusually large testamentary gift or bequest. As a result of the testamentary gift or bequest of the LLC membership interest, which is a minority interest, Taxpayer has excess business holdings of Entity 2 under IRC section 4943(c)(1).\nSummary:\nA nonprofit organization asks for more time to sell extra stocks it received after the deaths of a few key contributors. Nonprofit organizations have a limit on how many stocks they can keep, so they request an extension to get rid of them.\n##\nYou will now be presented with a part of a legal letter. Read it and summarize it. Make sure the summary is short and written in simple words.\nLetter:\nDear [Name],\nThis letter responds to a letter dated March 15, 2022, submitted on behalf of S Corporation Target, Purchaser, and Shareholder (collectively, the \"Parties\"), requesting an extension of time under §301.9100-3 of the Procedure and Administration Regulations to file an election. The Parties are requesting an extension of time to file the election statement under §1.336-2(h)(3)(iii) of the Income Tax Regulations (the \"Election Statement\") with respect to Purchaser's acquisition of all the stock of S Corporation Target from Shareholder on Date 1. The material information submitted is summarized below.\nOn Date 1, Purchaser acquired all the stock of S Corporation Target from Shareholder (the \"Stock Disposition\"). It has been represented that the Stock Disposition qualified as a \"qualified stock disposition\" as defined in §1.336-1(b)(6).\nThe Parties intended to make a section 336(e) election for the Stock Disposition but, for various reasons, a timely election was not fully made. 
Subsequently, this request was submitted, under §301.9100-3, for an extension of time to file the Election Statement. The Parties each represented that they are not seeking to alter a return position for which an accuracy-related penalty has been or could be imposed under section 6662.\nRegulations promulgated under section 336(e) permit certain sales, exchanges, or distributions of stock of a corporation to be treated as asset dispositions if: (1) the disposition is a \"qualified stock disposition\" as defined in §1.336-1(b)(6); and (2) a section 336(e) election is made.\nSummary:\n",
      "segmentType": "other"
    },
    {
      "segmentText": "Due to the difficulty of this task, this prompt is not sufficient. By adding another example (\"shot\"), however, we will get much better results. Below you can see the completion:",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "A corporation asks the government for more time to file a statement related to its financial performance for a specific tax year. The government has granted the request but may check the information provided by the company later.<br>##",
      "segmentType": "other"
    },
    {
      "segmentText": "Notes:",
      "segmentType": "h3"
    },
    {
      "segmentText": "Like in the previous case, you should try several phrasings for your prompt.You can test several options for the prompt in our playground.Alternatively, you could have summarized for lawyers first, and then simplified the summary (using a different prompt). This method may, however, result in some information being lost.",
      "segmentType": "normal_text_short"
    },
    {
      "segmentText": "Summary",
      "segmentType": "h2"
    },
    {
      "segmentText": "In this post, we explore the use-case of summarization.If you need a simple, off-the-shelf summarizer, be sure to check out our specialized summarization API. However, for more specific use-cases and customization, it’s wise to get closer to the core with the help of large language models, as custom models that are tailored to your specific needs will always get you higher quality results. Click here to learn more.",
      "segmentType": "normal_text"
    }
  ]
}
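
A response shaped like the one above can be post-processed with a few lines of Python. The sketch below is illustrative only: the helper names are hypothetical, and the set of heading types is inferred from the sample (`h2` and `h3` appear in it; `h1` is an assumption). It drops empty segments, such as the one with an empty `segmentText` in the sample, and groups the remaining text under the most recent heading.

```python
from collections import Counter

# "h2" and "h3" appear in the sample response; "h1" is assumed to exist as well.
HEADING_TYPES = {"h1", "h2", "h3"}


def non_empty_segments(response: dict) -> list:
    """Drop segments whose text is empty or whitespace only."""
    return [s for s in response.get("segments", []) if s["segmentText"].strip()]


def count_by_type(response: dict) -> Counter:
    """Count how many segments of each segmentType the response contains."""
    return Counter(s["segmentType"] for s in response.get("segments", []))


def group_under_headings(response: dict) -> list:
    """Attach each non-heading segment to the most recent heading segment."""
    sections = [{"heading": None, "texts": []}]  # text before the first heading
    for seg in non_empty_segments(response):
        if seg["segmentType"] in HEADING_TYPES:
            sections.append({"heading": seg["segmentText"], "texts": []})
        else:
            sections[-1]["texts"].append(seg["segmentText"])
    # Discard the leading bucket if nothing preceded the first heading.
    return [s for s in sections if s["heading"] is not None or s["texts"]]
```

Grouping by heading is one convenient way to turn the flat `segments` list into chunks for downstream tasks such as per-section summarization.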