<Lexalytics root>/data/structure


This directory contains data files used to analyze the structure and layout of plain text. Each file is described in more detail below the list of files.

entityevidence.dat Controls all of the entities extracted for lists and tables
listtypes.dat Controls all of the bullet characters for lists
sentimentevidence.hsd A basic HSD file that controls sentiment extracted for lists and tables

Customizing functionality in user/structure

entityevidence.dat

This file controls all of the entities extracted for lists and tables. Add entries to this file to have more entities extracted from lists and tables. It is important to note that this file is "everything" sensitive: entities are extracted only if the header of the list or table EXACTLY matches an entry in this file. The default file provides three basic examples, but these may not be relevant for many users. Change the entries to suit the content being analyzed and the tables and lists expected in that content.
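The exact-match rule can be illustrated with a minimal sketch. It assumes that "EXACTLY" covers letter case and whitespace, which the description above implies but does not spell out. The header names and the function `header_matches` are hypothetical; this is not the Lexalytics API and it does not reproduce the actual format of entityevidence.dat.

```python
# Hypothetical header entries, standing in for the contents of
# entityevidence.dat. The real file format is not shown in this page.
KNOWN_HEADERS = frozenset({"Company Name", "Location", "Contact"})


def header_matches(header, entries=KNOWN_HEADERS):
    """Return True only when the header is identical to an entry,
    character for character (no case folding, no trimming)."""
    return header in entries


# Headers as they might appear above a list or table in the content.
candidates = [
    "Company Name",    # identical to an entry
    "company name",    # differs in case
    "Company Name ",   # trailing space
    "Company  Name",   # double space
]

results = {h: header_matches(h) for h in candidates}

for header, matched in results.items():
    print(repr(header), "matched" if matched else "no match")
```

Under these assumptions only the first header would match, so a list headed "company name" would need its own entry in the file.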


listtypes.dat

This data file controls all of the bullet characters recognized for lists. Modify it if the content is expected to use bullet characters that are not already listed. Most users will rarely need to change this file, since the most common bullet characters should already be covered.
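As a rough illustration of why an unlisted bullet character matters, the sketch below treats a line as a list item only when it starts with a known bullet. The bullet set, the requirement of a space after the bullet, and the function `split_bullet` are assumptions made for this example. They do not describe the actual contents of listtypes.dat or how the engine detects lists.

```python
# Hypothetical bullet characters, standing in for the contents of
# listtypes.dat. The real defaults are not shown in this page.
DEFAULT_BULLETS = ("-", "*", "\u2022")  # hyphen, asterisk, bullet


def split_bullet(line, bullets=DEFAULT_BULLETS):
    """Return (bullet, item_text) if the line starts with a known bullet
    followed by a space, otherwise None."""
    stripped = line.lstrip()
    for bullet in bullets:
        if stripped.startswith(bullet + " "):
            return bullet, stripped[len(bullet) + 1:].strip()
    return None


# A line using an arrow as its bullet is not recognized by default.
arrow_line = "\u2192 Quarterly revenue"
before = split_bullet(arrow_line)

# After the arrow is added to the bullet set, the same line is recognized.
extended = DEFAULT_BULLETS + ("\u2192",)
after = split_bullet(arrow_line, extended)

print(before)
print(after)
```

In this sketch the arrow-prefixed line is ignored until the arrow is added to the bullet set, which mirrors the reason a user would add a character to the file.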


sentimentevidence.hsd

A basic HSD file that controls the sentiment extracted for lists and tables. As with entityevidence.dat, the header of the list or table has to match an entry in this file exactly. Tune this file to suit the content being analyzed in the session.