Agiloft offers hosted AI Capabilities you can use in your system for clause extraction and named entity recognition (NER) across several standard contract types. Beyond the hosted AI Capabilities, additional options are available through Amazon SageMaker that can be easily configured and trained. Some of these are already set up for seamless integration with your Agiloft system. These options are all described in detail below.

AI features require an Enterprise license. Training AI models requires an Extended Enterprise license.

Hosted AI Capabilities

Agiloft maintains two custom-built AI Capabilities that can identify, delineate, and extract data from contracts. One specializes in the extraction of metadata such as contracting parties, effective dates, and contract titles. The other AI Capability can extract and name clauses such as the termination clause or limitation of liability. There are three additional models that can, respectively, be used to ask questions about the content of a contract, classify the contract as a specific Contract Type, and compare the meaning of two clauses. These AI Capabilities integrate seamlessly with Agiloft using the steps in Setting Up AI.

The metadata extraction and clause extraction AI Capabilities are already trained to extract items common to most contracts, such as termination clauses and job titles, and can output data directly into records in Agiloft using machine learning actions. Agiloft can customize each capability to work with your specific contract types and extract a customized set of items. In Agiloft's AI Core, the five capabilities are:

Question Answering

The Question Answering model can pull data from a contract document based on an input posed as a question. It can process both binary and extractive questions. Binary questions require a yes or no answer, whereas extractive questions are answered with contextual text drawn from the document. Included below are examples of how the Question Answering model processes each type of question. When asked either kind of question, the model outputs an answer, a confidence score, an answer type, a long answer, and passages.

Here is an example of a binary question.

"question": "Does the agreement renew automatically?"

"answer": "No",
"answer score": 0.9982195496559143,
"answer type": "Yes/No",
"long answer": "This Agreement shall be terminated as of the end of the defined term, unless the parties renew the same in writing.",
"passages": "This Agreement shall be terminated as of the end of the defined term, unless the parties renew the same in writing.\nThis Agreement shall be effective as of December 1, 2018 and shall continue until July 31, 2019 (\u201cTermination Date\u201d), subject to the termination provisions contained in paragraph 6.\nIf any provisions of this Agreement are declared invalid and unenforceable, the remainder of this Agreement shall continue in full force and effect.\nD. This eight (8) month Employment Agreement replaces the previous eleven (11) month Employment Agreement, dated December 29, 2017 (the \u201c11-month Employment Agreement\u201d), which terminates on November 30, 2018, between the Company and Mr.\nIt may not be changed orally but only by an agreement in writing signed by any party against whom enforcement of any waiver, change, modification, extension, or discharge is sought.\n",

Here is an example of an extractive question.

"question": "How can the agreement be renewed?"


"answer": "by an agreement in writing",
"answer score": 0.6870729327201843,
"answer type": "Text",
"long answer": "It may not be changed orally but only by an agreement in writing signed by any party against whom enforcement of any waiver, change, modification, extension, or discharge is sought.",
"passages": "It may not be changed orally but only by an agreement in writing signed by any party against whom enforcement of any waiver, change, modification, extension, or discharge is sought.\nThis Agreement shall be terminated as of the end of the defined term, unless the parties renew the same in writing.\nThis Agreement shall be effective as of December 1, 2018 and shall continue until July 31, 2019 (\u201cTermination Date\u201d), subject to the termination provisions contained in paragraph 6.\n",

The Long Answer output generally contains full sentences or longer phrases that show where the model got the answer. Passages can return an even larger set of related sentences.

ATHENA

The ATHENA models are proprietary models used in a shared AI Capability referred to as Contracts AI. They are multi-purpose models that can find both NER and clause extraction (CE) data for numerous different Contract Types. The Contracts AI capability currently consists of two ATHENA models: ATHENA-NER-AS and ATHENA-CE-AS. ATHENA-NER-AS is used to extract metadata, such as Contract Amount, from many different types of contracts. ATHENA-CE-AS is used to extract clauses, such as Termination or Payment Description. The ATHENA models are available in KBs that have an Enterprise license and have AI enabled.

The ATHENA models can extract important data from virtually any kind of contract at lightning speed. Starting with Agiloft Release 21, the ATHENA models can be configured to extract only the metadata or clauses of your choosing, using the Labels tab of the machine learning action wizard. The Labels tab creates a framework where the AI is capable of extracting nearly everything but can be tailored to extract only the data relevant to your goals. You can configure multiple machine learning actions that use different ATHENA configurations in order to extract different datasets, as well as map extracted data to different fields. This gives users more flexibility over which data their machine learning actions extract from contracts, and how that data is used.

Although ATHENA is fully functional in its current state, it is continuously improved and updated as the Agiloft Data Science team develops and adds more functionality. To get the latest version of the ATHENA models, go to Setup > Integrations and click Configure under AI.

Amazon SageMaker AI Capabilities

Customized AI Capabilities for CLM are hosted in your own AWS environment and are only available to you once you connect your AWS account with Agiloft. Although it's preferred that you use your own AWS account, Agiloft can create and maintain an AWS account for you if necessary, and pass the cost on as a component of your regular invoice. This ensures the privacy of your data and provides you with the freedom to scale processing capacity as needed.

If you want to go beyond the hosted or customized AI Capabilities, additional algorithms on Amazon SageMaker are also available for use with your Agiloft system. SageMaker is only available with an AWS account. Each algorithm is described in more detail below.

BlazingText

The BlazingText algorithm implements text classification. It is useful for many downstream natural language processing (NLP) tasks, such as sentiment analysis, named entity recognition, and machine translation. Agiloft has trained BlazingText on Amazon SageMaker using over 6,500 documents, and it can identify the type of a contract from among six different contract types.

Examples:

Purpose: Text Classification

Training Notes: BlazingText expects training material formatted as a text file. Each line should represent a separate training document, with the first element in the row being the classification label in the format __label__YOURLABEL, followed by the actual text of the document on the same line, as in the sketch below.
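To make the layout concrete, here is a minimal Python sketch that writes a file in this format; the contract-type labels and document text are hypothetical placeholders.

# Minimal sketch of writing a BlazingText-style training file.
# Labels and document text are hypothetical placeholders.
training_examples = [
    ("NDA", "This Non-Disclosure Agreement is entered into by and between ..."),
    ("MSA", "This Master Services Agreement governs the provision of services ..."),
    ("SOW", "This Statement of Work describes the deliverables and schedule ..."),
]

with open("blazingtext_train.txt", "w", encoding="utf-8") as f:
    for label, text in training_examples:
        # One document per line: "__label__YOURLABEL <document text>"
        f.write(f"__label__{label} {text}\n")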

For additional information, see the Amazon AWS BlazingText documentation.

DeepAR

DeepAR is a neural network-based forecasting algorithm. Forecasting is a central problem in many businesses, and forecasting algorithms are crucial for most aspects of supply chain optimization. For example, these algorithms are used for inventory management, staff scheduling, and topology planning.

Examples:
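As a rough illustration of the kind of time-series input DeepAR-style forecasting works with, the Python sketch below writes a JSON Lines file where each line holds one series with a start timestamp and a list of observed values; the series values and file name are hypothetical.

import json

# Hypothetical monthly demand figures for two products.
series = [
    {"start": "2023-01-01 00:00:00", "target": [112, 98, 130, 125, 140, 152]},
    {"start": "2023-01-01 00:00:00", "target": [40, 35, 38, 42, 44, 51]},
]

# One JSON object per line (JSON Lines), the layout commonly used for DeepAR input.
with open("deepar_train.jsonl", "w", encoding="utf-8") as f:
    for s in series:
        f.write(json.dumps(s) + "\n")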

For additional information, see the Amazon AWS DeepAR documentation.

Factorization Machines

Factorization machines are used for making predictions on large, sparse data sets, where only a few values are non-zero. Some examples include click predictions or movie recommendations.

Examples:
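To show why this data is considered sparse, the Python sketch below one-hot encodes hypothetical (user, ad, clicked) click-prediction rows into a matrix where almost every value is zero; the IDs and dimensions are made up for illustration.

import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical click-prediction rows: (user_id, ad_id, clicked).
n_users, n_ads = 1000, 500
rows = [(12, 87, 1), (412, 3, 0), (954, 499, 1)]

data, row_idx, col_idx, labels = [], [], [], []
for i, (user, ad, clicked) in enumerate(rows):
    # Each row turns on exactly two columns: one in the user block, one in the ad block.
    row_idx += [i, i]
    col_idx += [user, n_users + ad]
    data += [1.0, 1.0]
    labels.append(clicked)

X = csr_matrix((data, (row_idx, col_idx)), shape=(len(rows), n_users + n_ads))
y = np.array(labels)
print(X.shape, X.nnz)  # 3 rows x 1500 columns, but only 6 non-zero values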

For additional information, see the Amazon AWS Factorization Machines documentation.

K-Nearest Neighbor (K-NN)

K-NN is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure, such as a distance function. The algorithm is inherently non-linear: it can handle both linearly and non-linearly distributed data, and it tends to perform very well when there are many data points.

Examples:

For each class definition, the model expects a field named label_X, where X is a class ID number, and several numerical_feature_Y fields, where Y is a numerical feature ID number incremented for each field. The numerical features can also hold non-numerical values, which are stored as integers and mapped back to their original values after inference.

Example fields: label_1, numerical_feature_1, numerical_feature_2, numerical_feature_3
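A rough Python sketch of the same idea, using scikit-learn's KNeighborsClassifier rather than the SageMaker built-in, is shown below; the feature values and class IDs mirror the label_1 / numerical_feature_Y layout above and are hypothetical.

from sklearn.neighbors import KNeighborsClassifier

# Rows correspond to numerical_feature_1..3; classes come from label_1.
X_train = [
    [12000.0, 30.0, 2.0],
    [450000.0, 365.0, 12.0],
    [9800.0, 45.0, 1.0],
    [520000.0, 400.0, 18.0],
]
y_train = [1, 2, 1, 2]  # hypothetical class IDs

# Classify a new record by distance to its nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(knn.predict([[15000.0, 60.0, 3.0]]))  # -> [1]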

Training Notes: This model includes three parameters:

For additional information, see the Amazon AWS K-NN documentation.

Latent Dirichlet Allocation (LDA)

LDA can be used to sort data into a number of topics. Each topic represents a set of words and the distribution of those words in the text. The goal of LDA is to map all input documents to the topics so that the words in each document are mostly captured by those topics. The results of an LDA analysis are often plotted on a scatter chart and colored in order to visualize the distribution of documents among the topics. This model is similar to the NTM model, but it allows a range for how many topics you expect, whereas NTM requires a specific number of topics.

Examples:

Purpose: Topic Modelling

Training Notes: There is no way to specify the expected topics, so this model requires multiple rounds of training, alternated with manual review of the results and adjustment of the vocabulary, number of topics, or document selection for the next training round. The model doesn't provide a name for the derived topics, so the trainer or another admin must name the topics manually.
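A minimal sketch of this review loop, using scikit-learn's LatentDirichletAllocation in place of the SageMaker LDA algorithm, is shown below; the document snippets and topic count are hypothetical, and naming the resulting topics remains a manual step.

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical document snippets; in practice these would be contract texts.
docs = [
    "termination notice period thirty days written notice",
    "payment invoice net thirty days late fee interest",
    "confidential information disclosure obligations survive termination",
    "fees payment schedule invoice due upon receipt",
]

# Bag-of-words counts, then fit LDA with a guessed number of topics.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Review the top words per topic and adjust vocabulary or topic count for the next round.
terms = vectorizer.get_feature_names_out()
for topic_id, weights in enumerate(lda.components_):
    top_words = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(topic_id, top_words)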

For additional information, see the Amazon AWS LDA documentation.

Neural Topic Model (NTM)

Amazon SageMaker's Neural Topic Model (NTM) caters to use cases where finer control of the training, optimization, or hosting of a topic model is required. For example, if you need to train models on texts of a particular writing style or domain, such as legal documents, NTM is well suited to those needs. This AI Capability is similar to LDA, but it requires a specific value for how many topics you expect, whereas LDA allows a range.

Examples:

Purpose: Topic Modelling

Training Notes: There is no way to specify the expected topics, so this model requires multiple rounds of training, alternated with manual review of the results and adjustment of the vocabulary, number of topics, or document selection for the next training round. As part of this process, a specialist can download the model and perform additional handling to see the word clouds generated by the model, and then use the word clouds to determine how best to correct the model's training. If NTM is used to prepare for further classification, a specialist can run inference on all the records to update them with their top-scoring topic, review the results, update records with low-scoring or undetected topics, and then run the data set through BlazingText training to create a well-trained BlazingText model. This process is useful for data sets that are completely unsorted and uncategorized; a sketch of the hand-off appears below.
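Here is a minimal Python sketch of that hand-off under stated assumptions: the per-document topic scores stand in for real inference output, and the topic names, documents, and file name are hypothetical.

# Hypothetical hand-off from topic-model inference to BlazingText training data.
topic_names = {0: "TERMINATION", 1: "PAYMENT"}  # assigned manually after review

def top_topic(scores):
    # Return the index of the highest-scoring topic for one document.
    return max(range(len(scores)), key=lambda i: scores[i])

docs = ["termination notice period thirty days", "invoice due net thirty days"]
scores_per_doc = [[0.91, 0.09], [0.12, 0.88]]  # stand-in for real inference output

with open("blazingtext_from_topics.txt", "w", encoding="utf-8") as f:
    for text, scores in zip(docs, scores_per_doc):
        label = topic_names[top_topic(scores)]
        f.write(f"__label__{label} {text}\n")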

For additional information, see the Amazon AWS NTM documentation.

XGBoost

This AI Capability attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models. XGBoost is an implementation of gradient boosted decision trees designed for speed and performance, and it has recently been dominating applied machine learning and Kaggle competitions for structured or tabular data. The algorithm can be used to solve a wide variety of regression and classification problems.

Examples:
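As a generic illustration of gradient boosted trees on tabular data, the Python sketch below uses the open-source xgboost package's scikit-learn-style API rather than the SageMaker built-in; the data is synthetic.

import numpy as np
from xgboost import XGBClassifier

# Synthetic tabular data: 200 records, three numeric features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# An ensemble of shallow gradient-boosted trees combines many weak estimates
# into one stronger prediction.
model = XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))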

For additional information, see the Amazon AWS XGBoost documentation.
