Legacy AI Glossary

This page serves as a single source of truth for the acronyms and technical jargon associated with Agiloft's Artificial Intelligence capabilities. 

| Term | Definition |
| --- | --- |
| AI | Artificial Intelligence. The simulation of human intelligence processes by a computer system that has been trained to do so, such as understanding speech and writing, recognizing and identifying visual cues, and even playing games like chess or Go. |
| AI Capabilities | A general term for the AI features that Agiloft supports. One example is extracting data with a custom model that can be trained to suit the customer's needs. |
| AI Project KB | The AI KB template used to host custom models. |
| Annotation | An instance of contract data that has been assigned either a clause type or a key term type. An annotation may also be referred to as a tag or label. |
| Annotation Policy | The policy that governs how label sets are applied to contract documents during the tagging period. |
| ATHENA | A proprietary Agiloft machine learning model based on the open-source model DistilRoberta. It is used to extract clauses and key terms from a contract document. Two ATHENA models are available: ATHENA-CE, which is used for clause extraction (CE), and ATHENA-NER, which is used to extract key terms. |
| ATHENA-CE | An ATHENA model used for clause extraction. The current model can extract 71 types of clauses. |
| ATHENA-NER | An ATHENA model used for key term extraction. The current model can extract 33 different key terms. |
| AWS SageMaker | The platform used to develop and host custom machine learning models. It can be accessed via the AI Credentials table in Agiloft once AI has been enabled. Google Tensorflow integrates into AWS SageMaker. |
| CE | Clause Extraction. |
| CLASSIFICATION-BT | The model used to determine the Contract Type of a contract. It is based on the BlazingText model and can classify ten different Contract Types. |
| CML KB | The KB template used to store and register AI users. It is added to customer servers, and it is where the Access Key and Secret Key are generated. Users can generate these keys by clicking the Get Keys button in the AI Credential record after they click Deploy/Configure AI. |
| Confidence Score | A score that denotes how confident the model is in its prediction for a specific instance of data. |
| Container | Used to hold multiple submodels. |
| Custom models | Models that can be trained on targeted data sets for a specific customer need. While Agiloft offers this capability, it requires significant Data Science and Implementation resources. Before proposing this option to customers, consult the Data Science team. |
| Data Science | The study of artificial intelligence, machine learning, and their applications. |
| standard AI system demo | The AI KB template that should be used when hosting generic models. |
| DistilRoberta | The open-source model that ATHENA is based on. This model is pretrained on a legal dictionary of 10 million tokens. |
| Docker container | Technology that allows us to package multiple models (submodels) within our larger ATHENA models. This is what allows the same ATHENA-CE model to run for all kinds of Contract Types. |
| Extraction | A general term for when an AI model pulls data, such as a clause or a key term, from a contract document into a KB. |
| F1 | An overall measure of a model's accuracy, and usually the preferred benchmark of success. It is the harmonic mean of precision (the share of the model's predictions that are correct, which drops as false positives increase) and recall (the share of the true instances that the model finds, which drops as false negatives increase). |
| Generic models | The models that come with the standard AI system demo. They work with all the existing document types included in Agiloft out-of-the-box. |
| Google Tensorflow | TensorFlow is an end-to-end open source platform for managing all aspects of a machine learning system. It can integrate into AWS SageMaker for custom model training. |
| Label sets | The clauses and key terms that you would like an instance of ATHENA to pull from a contract document. Label sets are an important component of the annotation strategy. |
| Library Clause | A clause that is contained in the Clause Library table. |
| Machine Learning | A method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention, and that these findings can be optimized to perform tasks with an increasing level of accuracy. |
| Machine Learning action | A type of action available in KBs that have AI enabled. These actions are used to run AI models. For example, in the standard AI system demo, the Attachments table generally contains an action button called Extract Clauses. This button runs the ATHENA-CE model when clicked. |
| Multimodels | Models that combine multiple submodels to create a more comprehensive extraction process. This lets you train separate submodels on individual document types and then combine them in a multimodel container, so that each submodel can be run from a single location. |
| NER | Named Entity Recognition. At Agiloft, NER refers to models that extract key terms. |
| NLP | Natural Language Processing. An AI field that aims to program computers to process and analyze large amounts of natural language data, such as speech or text. |
| OCR | Optical Character Recognition. |
| Question Answering | An AI Capability that allows you to ask your KB a question about a contract and have the KB return the relevant piece of data. For example, if you input "Contract Title?" the model may return a value such as Lease Agreement or Non-Disclosure Agreement. You can also ask the model yes or no questions such as "Does the contract contain a Governing Law clause?" |
| Table Data Extraction | This form of extraction is focused on topics. Table data extraction and parsing requires custom algorithm development. The base algorithms for this model are Lattice and Stream. |
| Tokens | AI models are trained using tokens, which can be thought of as sequences of characters grouped together as a useful semantic unit for processing. A word is an example of a token. |
| Training | The process of teaching a model what to recognize. Training is accomplished by feeding an AI model many inputs. In this case, the inputs are contract documents that have been annotated with predetermined label sets, which are generally based on contract type. As the model sees more contracts, it sees more examples of the annotations that it should recognize. Once a certain threshold is met, the model can begin determining where these labels apply in contract documents it has never processed before, and can accurately annotate an unfamiliar document automatically when prompted. The end product, a model that can recognize the data you trained it to recognize, is a textbook example of "machine learning." |
| Submodels | Smaller models that are used in the same container to create larger models. For example, the ATHENA-NER multimodel consists of multiple submodels that have each been trained on a different Document Type and have their own set of labels. In Agiloft, these submodels come together to create a more robust extraction library of labels for a multimodel that can be run from a central location. |
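
To make the F1 entry concrete, here is a minimal sketch of how precision, recall, and F1 relate. The function name `precision_recall_f1` and the counts used are illustrative assumptions, not part of any Agiloft tooling.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw evaluation counts.

    tp: true positives  (labels the model predicted that were correct)
    fp: false positives (labels the model predicted that were wrong)
    fn: false negatives (correct labels the model failed to predict)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical evaluation run: 80 correct, 20 spurious, 120 missed.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=120)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With these made-up counts, precision is 0.80 and recall is 0.40, yet F1 comes out at roughly 0.53. Because it is a harmonic mean, F1 is pulled toward the weaker of the two measures, which is why it is a stricter benchmark than a simple average.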

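
The Annotation and Confidence Score entries can be illustrated with a small data structure. This is a sketch only: the `Annotation` class, its field names, the `filter_by_confidence` helper, and the 0.8 threshold are all hypothetical and do not reflect how Agiloft stores or filters extracted data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    """A span of contract text tagged with a clause or key term type (hypothetical shape)."""
    label: str         # e.g. a clause type or key term type
    text: str          # the contract text that was tagged
    confidence: float  # the model's confidence score, assumed to be in [0, 1]


def filter_by_confidence(annotations: List[Annotation], threshold: float) -> List[Annotation]:
    """Keep only annotations whose confidence score meets the threshold."""
    return [a for a in annotations if a.confidence >= threshold]


# Made-up model output for a single contract.
predicted = [
    Annotation("Governing Law", "This Agreement is governed by the laws of ...", 0.97),
    Annotation("Termination", "Either party may terminate ...", 0.62),
    Annotation("Effective Date", "January 1, 2024", 0.88),
]

accepted = filter_by_confidence(predicted, threshold=0.8)
print([a.label for a in accepted])
```

A threshold like this trades recall for precision: raising it discards more uncertain annotations, which reduces false positives but may also drop correct extractions.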

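
The relationship between the Multimodels, Submodels, and Container entries can be sketched as a simple dispatcher that routes a document to the submodel registered for its document type. Everything here is an assumption made for illustration: the `Multimodel` class, its methods, and the toy keyword-based submodels are hypothetical stand-ins, not the ATHENA implementation.

```python
from typing import Callable, Dict, List

# A submodel is modeled here as any function that maps contract text to labels.
Submodel = Callable[[str], List[str]]


class Multimodel:
    """Holds several submodels and runs the one that matches the document type."""

    def __init__(self) -> None:
        self._submodels: Dict[str, Submodel] = {}

    def register(self, document_type: str, submodel: Submodel) -> None:
        """Add a submodel trained for one document type."""
        self._submodels[document_type] = submodel

    def extract(self, document_type: str, text: str) -> List[str]:
        """Run the submodel registered for the given document type."""
        if document_type not in self._submodels:
            raise ValueError(f"No submodel registered for {document_type!r}")
        return self._submodels[document_type](text)


# Toy stand-ins for trained submodels: each looks for one keyword.
def nda_submodel(text: str) -> List[str]:
    return ["Confidentiality"] if "confidential" in text.lower() else []


def lease_submodel(text: str) -> List[str]:
    return ["Rent"] if "rent" in text.lower() else []


multimodel = Multimodel()
multimodel.register("NDA", nda_submodel)
multimodel.register("Lease", lease_submodel)

print(multimodel.extract("NDA", "All Confidential Information shall remain ..."))
```

The point of the design is that callers interact with one entry point, while each submodel keeps its own label set and can be retrained independently of the others.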