User Entity Lists

User entity lists are the Salience equivalent of the simple entities and query-defined entities found in previous versions of Salience Engine. The format combines the capabilities of these two earlier methods of entity extraction.

NOTE: Although user entity lists look like CDL files, there are slight but important differences in the format. CDL files are intended to augment model-based entity extraction and should reside within the subfolders of the salience/entities folder in your user directories. A user entity list, by contrast, is specified through an option method call, so it can reside anywhere on your system.

File format

A user entity list is a tab-delimited file with up to three columns. Column 1 is required; columns 2 and 3 are optional.

Column    Required?    Contents
1         Y            The entity to seek. Added in Salience 5.1: The + symbol indicates the use of a query to define the entity.
2         N            Normalized form for the entity, or a set of specifications for entity type, label, and normalized form.
3         N            A query to define the existence of the entity, in the same manner as query-defined entities.

Examples:
Lexalytics<tab>Lexalytics, Inc
Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc
Lexalytics<tab>type=Company<tab>Lexalytics OR LXA
Example with + operator:
+Lexalytics OR LXA<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc
Examples for enforcing no stemming:
+!Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc
Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc<tab>!Lexalytics
Examples for enforcing case-sensitivity:
+~Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc
Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc<tab>~Lexalytics
Examples for enforcing no stemming AND case-sensitivity:
+!~Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc
Lexalytics<tab>label=Text Analytics, type=Company, normalized=Lexalytics, Inc<tab>!~Lexalytics
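
The column rules above can be checked with a small script before a list is handed to Salience. The following is a minimal Python sketch, not Salience's own parser. It assumes that <tab> in the examples stands for a literal tab character, that column 2 is a specification set whenever it begins with label=, type=, or normalized=, and that a comma only separates specifications when it is followed by one of those keys (so "Lexalytics, Inc" stays intact). The UserEntity class and parse_line function are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a comma separates specifications only when the next token is
# one of the known keys, so the comma inside "Lexalytics, Inc" is preserved.
_SPEC_START = re.compile(r"\s*(?:label|type|normalized)\s*=")
_SPEC_SPLIT = re.compile(r",\s*(?=(?:label|type|normalized)\s*=)")


@dataclass
class UserEntity:
    """Hypothetical container for one line of a user entity list."""
    column1: str                      # column 1 exactly as written
    query: Optional[str] = None       # from a leading "+" or from column 3
    label: Optional[str] = None
    type: Optional[str] = None
    normalized: Optional[str] = None


def parse_line(line: str) -> UserEntity:
    """Split one tab-delimited line into its (up to) three columns."""
    cols = line.rstrip("\r\n").split("\t")
    if not cols[0]:
        raise ValueError("column 1 (the entity to seek) is required")
    entry = UserEntity(column1=cols[0])

    # Salience 5.1 and later: a leading "+" marks column 1 as a query.
    if cols[0].startswith("+"):
        entry.query = cols[0][1:]

    # Column 2: either a bare normalized form or key=value specifications.
    if len(cols) > 1 and cols[1]:
        if _SPEC_START.match(cols[1]):
            for part in _SPEC_SPLIT.split(cols[1]):
                key, _, value = part.partition("=")
                setattr(entry, key.strip(), value.strip())
        else:
            entry.normalized = cols[1].strip()

    # Column 3: a query, used here only if column 1 did not supply one.
    if len(cols) > 2 and cols[2] and entry.query is None:
        entry.query = cols[2]
    return entry


if __name__ == "__main__":
    sample = "Lexalytics\ttype=Company\tLexalytics OR LXA"
    print(parse_line(sample))
```

The query text is kept as written, including any ! or ~ operators, because interpreting those operators is the job of the query grammar rather than of the file format.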

Using queries

Several of the examples above use query syntax to define the existence of the entity. This is useful when extraction requires some logic to disambiguate the entity, for example when the entity should match either a company name or its abbreviation, as in Lexalytics OR LXA. Refer to the Query Grammar page for documentation on the supported query operators and constructs.

API Usage

To use a user entity list, specify the file through the option method that sets the user entity list for the Salience session. After text has been provided to Salience, calls to get user entities (not named entities) return the entities from the current user entity list that match the text.

  1. Initialize Salience session
  2. Call API method to set user entity list (User Entity List option)
  3. Prepare text
  4. Call API method to get user entities
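
The order of the four steps above can be sketched in Python as follows. This sketch illustrates the call sequence only. The method names set_user_entity_list, prepare_text, and get_user_entities are hypothetical placeholders, not the names used by any Salience wrapper, and RecordingSession is a stand-in that performs no extraction. Consult the API reference for your wrapper for the actual option and method names.

```python
from typing import List


class RecordingSession:
    """Stand-in for a Salience session (hypothetical, does no real work).

    It only records which methods were called, and in what order.
    """

    def __init__(self) -> None:
        self.calls: List[str] = []

    def set_user_entity_list(self, path: str) -> None:
        # Placeholder for the option method that sets the user entity list.
        self.calls.append("set_user_entity_list")

    def prepare_text(self, text: str) -> None:
        # Placeholder for the call that provides text to the session.
        self.calls.append("prepare_text")

    def get_user_entities(self) -> List[str]:
        # Placeholder for the user entity call (not the named entity call).
        self.calls.append("get_user_entities")
        return []


def extract_user_entities(session, list_path: str, text: str) -> List[str]:
    """Run steps 2 to 4 against an already initialized session."""
    session.set_user_entity_list(list_path)  # step 2: set the list first
    session.prepare_text(text)               # step 3: then provide the text
    return session.get_user_entities()       # step 4: then fetch user entities


if __name__ == "__main__":
    session = RecordingSession()             # step 1: initialize the session
    extract_user_entities(session, "my_entities.dat", "Lexalytics released Salience.")
    print(session.calls)
```

Because the list is supplied through an option rather than discovered in a data directory, the path passed in step 2 can point anywhere on the system.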