The Entity Management Toolkit was built to support the training of new entity models. A fresh install of Salience detects a number of different entity types, including people, products and dates. If you need Salience to detect other types of entities, such as medical terms or product features, the Entity Management Toolkit allows you to train new models. After a model is trained and deployed, the new entity type will appear in the metadata produced by Salience. The Entity Management Toolkit is thus one of the tools provided for customizing our text analytics suite to your exact needs and to the specifics of the document sets you are interested in.
The principal activity in creating a new entity model is document annotation. To learn the context, words and textual features that define an entity, our lexical training technology needs examples of the new entity type. You provide these by loading documents into the tool and highlighting the occurrences of the entity type in the text. Once a sufficient number of occurrences have been marked, you can define lexical features that you think might help determine when an entity has occurred. This step is optional: the Entity Management Toolkit ships with a variety of generally useful features already defined. Once you are happy with your annotations and features, you press Train. After a short wait, the Entity Management Toolkit will return with a newly created model, ready for deployment. The Entity Management Toolkit can then be used to test the quality of the results produced by the new model. As you find mistakes in the model's output, you can correct them, preparing new documents for a new training run. As you iterate between marking up documents and reviewing newly created models, your model will converge on high-quality detection.
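To make the ideas of annotated examples and lexical features more concrete, here is a minimal Python sketch. It does not use the Entity Management Toolkit or any Salience API. The `Annotation` structure, the `lexical_features` and `training_examples` functions, the `MEDICAL_TERM` label and the particular features chosen are all hypothetical, and are meant only to illustrate the kind of information a highlighted occurrence and a lexical feature might carry.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A highlighted occurrence of an entity type (hypothetical structure)."""
    start: int   # index of the first token in the highlighted span
    end: int     # index one past the last token in the span
    label: str   # entity type, e.g. "MEDICAL_TERM" (illustrative name)


def lexical_features(tokens, i):
    """Return illustrative lexical features for the token at position i."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "suffix3": word[-3:].lower(),
        # Context words on either side, with sentinels at the boundaries.
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }


def training_examples(tokens, annotations):
    """Pair each token's features with its label ("O" = outside any entity)."""
    labels = ["O"] * len(tokens)
    for ann in annotations:
        for i in range(ann.start, ann.end):
            labels[i] = ann.label
    return [(lexical_features(tokens, i), labels[i]) for i in range(len(tokens))]


# One annotated sentence: the fifth token is marked as the new entity type.
tokens = "The patient was prescribed Ibuprofen for chronic pain".split()
annotations = [Annotation(start=4, end=5, label="MEDICAL_TERM")]
examples = training_examples(tokens, annotations)

for features, label in examples:
    print(label, features["lower"], features["prev_word"], features["next_word"])
```

In this sketch, each highlighted span becomes a set of labeled tokens, and each token is described by its own form plus the words around it. A trainer could learn from pairs like these which contexts signal the new entity type, which is why additional annotated documents from each review round give the next training run more to work with.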