getDocumentCategories

Summary

This method returns the categories for a document, drawn from a predefined category set that was derived from Wikipedia's content classification and covers a wide spectrum of subjects. Customers can tune the category set through data files, either excluding certain categories from consideration or adjusting others with additional terms. Categories are returned as a list of Salience Topic structures.
This method provides a wrapper around the underlying C API method lxaGetDocumentCategories.

Syntax

salience6.getDocumentCategories(oSession, acConfigurationID)

Parameters

oSession A SalienceSession object previously created via openSession
acConfigurationID An identifier for a configuration added through addConfiguration, or an empty string for the default configuration

Returns

If successful, returns a Python list consisting of items that contain the following information about a document category:
topic The label for the topic/category
type An integer indicating the type of category result: 2=category node, 3=category leaf, 4=category explain info
score A float value indicating the match score for the category
sentiment A float value indicating the sentiment for the category
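
As a rough sketch of how the returned list might be post-processed, the snippet below filters the results by type and orders them by score. It assumes each item can be read like a dictionary keyed by the field names listed here (the page's own example only confirms this for "topic"). The sample data, the top_categories helper, and its thresholds are hypothetical and are not part of the salience6 module.

```python
# Hypothetical sample data shaped like the documented return value;
# real values would come from se6.getDocumentCategories(session, "").
categories = [
    {"topic": "Sports", "type": 2, "score": 0.82, "sentiment": 0.10},
    {"topic": "Football", "type": 3, "score": 0.91, "sentiment": 0.25},
    {"topic": "Finance", "type": 3, "score": 0.40, "sentiment": -0.30},
]

# Type codes as listed in the Returns section.
CATEGORY_TYPES = {
    2: "category node",
    3: "category leaf",
    4: "category explain info",
}


def top_categories(items, wanted_type=3, min_score=0.5):
    """Return items of one type scoring at least min_score, best first.

    Hypothetical helper for illustration; not provided by salience6.
    """
    matches = [
        c for c in items
        if c["type"] == wanted_type and c["score"] >= min_score
    ]
    return sorted(matches, key=lambda c: c["score"], reverse=True)


for cat in top_categories(categories):
    print("%s (%s): score=%.2f sentiment=%.2f" % (
        cat["topic"], CATEGORY_TYPES[cat["type"]],
        cat["score"], cat["sentiment"]))
```

Filtering on type 3 keeps only leaf categories, which is one plausible way to avoid reporting both a parent node and its children; whether that suits a given application is a judgment call.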

Example

    import salience6 as se6

    # Open a session with the license file and data directory
    session = se6.openSession('/path/to/license.v5', '/path/to/data')
    ret = se6.prepareTextFromFile(session, '/path/to/aFile.txt')
    if ret == 0:
        # An empty string selects the default configuration
        categories = se6.getDocumentCategories(session, "")
        for category in categories:
            print(category["topic"])
    elif ret == 6:
        # Return code 6: print the warnings recorded for the session
        print(se6.getLastWarnings(session))
    se6.closeSession(session)
