<< prepareCollectionFromFile | Back to Salience 6 python Index | getSummary >>


Returns details about the current text, including term frequency analysis and document chunk information.
This method provides a wrapper around the underlying C API method lxaGetDocumentDetails.


salience6.getDocumentDetails(oSession, acConfigurationID)


oSession A SalienceSession object previously created via openSession
acConfigurationID An identifier for a configuration added through addConfiguration, or empty string for default configuration


If successful, returns a Python dictionary with the following keys:
document_terms A list of tokens contained in the document and their term frequencies
fingerprint A string containing the calculated fingerprint of the document (DEPRECATED)
internal_version A string providing the internal Salience representation of the document after preprocessing
sentences A list of the individual sentences in the document, where each sentence item contains a structure of information about the sentence.
sentence_count An integer giving the count of sentences in the document
word_count An integer giving the count of words in the document
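
As a minimal sketch of how the returned dictionary might be consumed, the snippet below condenses it into a few counts. The helper summarize_details and the sample_details dictionary are hypothetical and not part of salience6. The sample only mirrors the key names listed above, and the shape of the items inside document_terms and sentences is an assumption, so the helper treats them as opaque and only counts them.

```python
# Hypothetical helper, not part of salience6: condenses a details
# dictionary using only the keys documented above.
def summarize_details(details):
    """Return a small summary of a getDocumentDetails-style dictionary."""
    return {
        "sentence_count": details.get("sentence_count", 0),
        "word_count": details.get("word_count", 0),
        # List items are treated as opaque; only their number is used.
        "term_entries": len(details.get("document_terms", [])),
        "sentences_listed": len(details.get("sentences", [])),
    }


# Made-up sample standing in for a real return value. The shape of the
# list items is an assumption for illustration only.
sample_details = {
    "document_terms": [("salience", 2), ("engine", 1)],
    "fingerprint": "",
    "internal_version": "Salience is an engine .",
    "sentences": [{"text": "Salience is an engine."}],
    "sentence_count": 1,
    "word_count": 4,
}

summary = summarize_details(sample_details)
print(summary)
```

With a live session, the dictionary returned by getDocumentDetails would presumably be passed in place of sample_details. Using .get with defaults keeps the helper from raising if a key is absent.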


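    import salience6 as se6

    session = se6.openSession('/path/to/license.v5', '/path/to/data')
    ret = se6.prepareTextFromFile(session, '/path/to/aFile.txt')
    if ret == 0:
        # Text prepared successfully: fetch details using the default configuration
        details = se6.getDocumentDetails(session, "")
        print(details)
    elif ret == 6:
        # Return code 6: print the warnings recorded for this session
        print(se6.getLastWarnings(session))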
