Archive for the ‘How-to’ Category

Support Notes: Deploying the Salience assemblies on a Windows network drive

Thursday, December 20th, 2012

This is a brief note from a professional services engagement currently in progress. Our professional services team is developing a custom Excel plug-in for a client that uses Salience to analyze the content of their worksheets within the familiar Excel environment.

The Excel plug-in integrates with Salience through a Microsoft Office add-in written in C# and the Salience .NET wrapper. The client wanted the end-user desktops to have only the add-in installed, with Salience and its related data directory residing on a mapped network drive. Read on to see how we accomplished this.

(more…)

Feature highlight: Complex stems in Salience 5.1

Friday, November 16th, 2012

With every release of Salience, we work to enhance the engine’s ability to extract meaning from the ever-changing landscape of unstructured content. That might sound like a mouthful of marketing mumbo-jumbo, but this is a blog for developers, so what do I actually mean? I mean adding new methods and techniques, and tweaking existing ones, to handle all kinds of text gracefully and derive meaning from it. One such feature added in Salience 5.1 is a new option called Complex Stems. Let’s have a look at how and when you might use this option.

(more…)

Contacting Lexalytics support services

Thursday, November 1st, 2012

In addition to contacting Lexalytics support services via email at support@lexalytics.com, we have now opened a new channel for contacting us using the Lexalytics support portal at support.lexalytics.com.

This blog article will cover using the support portal to create, edit/update, close, and manage tickets.

(more…)

Using Salience via PowerShell (part 3): Text Files

Friday, August 5th, 2011

Today’s assignment: Convert some docx files to txt and then time how long it takes to process them, getting document sentiment and entities. Use PowerShell.

So first, let’s convert the Word documents to text files:

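# Start Word through COM automation (requires Word to be installed).
$word = New-Object -ComObject Word.Application

function Save-AsText($fn) {
  $doc = $word.Documents.Open($fn.ToString())
  # Change only the file extension, not every 'docx' in the path.
  $txtName = [System.IO.Path]::ChangeExtension($fn.ToString(), 'txt')
  # 2 is wdFormatText (plain text).
  $doc.SaveAs([ref] $txtName, [ref] 2)
  $doc.Close()
  echo $txtName
}

$c = Get-ChildItem -Recurse -Include *.docx
foreach ($fn in $c) {
    Save-AsText $fn
}

# Shut Word down once all documents are converted.
$word.Quit()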
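
One detail worth checking when deriving the output name is how the extension gets swapped. The short sketch below uses a made-up path (not one from the original data set) to compare String.Replace, which rewrites every occurrence of the substring, with [System.IO.Path]::ChangeExtension, which touches only the extension:

```powershell
# Hypothetical path, chosen so that 'docx' also appears in a folder name.
$path = 'C:\docx\report.docx'

# String.Replace rewrites every occurrence of the substring.
$viaReplace = $path.Replace('docx', 'txt')

# ChangeExtension only swaps the part after the final period.
$viaChangeExtension = [System.IO.Path]::ChangeExtension($path, 'txt')

$viaReplace           # C:\txt\report.txt
$viaChangeExtension   # C:\docx\report.txt
```

If none of your folder or file names contain the string docx, both approaches give the same result.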

Now that we’ve got our text files, we can use Measure-Command and Measure-Object to do the measuring:

Add-Type -Path "C:\Program Files\Lexalytics\Salience\bin\SalienceEngineFour.NET.dll"
$se = New-Object Lexalytics.SalienceEngine(
             'C:\Program Files\Lexalytics\license.dat',
             "C:\Program Files\Lexalytics\data")
$timings = @()
$c = Get-ChildItem -Recurse -Include *.txt
$cnt = 0
$s = 0
foreach ($fn in $c) {
   # Measure-Command returns a TimeSpan and discards the block's output,
   # so the return code is checked and reported after the timed block.
   $m = Measure-Command {
     $rc = $se.PrepareTextFromFile($fn.ToString())
     if ($rc -eq 0) {
       # Count the entities returned for this document.
       $cnt = ($se.GetEntities(0, 0, 0, 0, 50, 5) | Measure-Object).Count
       $s = $se.GetDocumentSentiment(0).fScore
     }
   }
   if ($rc -ne 0) {
     Write-Host "Failed to prepare text with code $rc on $fn"
     continue
   }
   $timings += $m.TotalMilliseconds
   Write-Host $fn $cnt $s $m.TotalMilliseconds
}

# Summarize the per-document timings.
$timings | Measure-Object -Minimum -Maximum -Average -Sum

And you’ll end up with a summary at the end like this:

Count    : 100
Average  : 511.2
Sum      : 51120
Maximum  : 999
Minimum  : 63

That works out to an average of about 511 milliseconds per document across the 100 documents processed.
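
If you want to try the timing pattern without a Salience install, here is a minimal sketch of the same Measure-Command and Measure-Object combination. Invoke-SampleWork is a hypothetical stand-in for the Salience calls, and the delays are arbitrary values picked for illustration:

```powershell
# Hypothetical stand-in for the per-document Salience work.
function Invoke-SampleWork([int] $Milliseconds) {
    Start-Sleep -Milliseconds $Milliseconds
}

$timings = @()
foreach ($delay in 10, 20, 30, 40, 50) {
    # Measure-Command returns a TimeSpan describing how long the block ran.
    $elapsed = Measure-Command { Invoke-SampleWork $delay }
    $timings += $elapsed.TotalMilliseconds
}

# Measure-Object only computes the statistics requested by switch.
$stats = $timings | Measure-Object -Minimum -Maximum -Average -Sum
$stats
```

The exact numbers will vary from run to run, but Count should always match the number of items timed.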

Analyzing collections with Salience Five

Monday, July 25th, 2011

This is a bit of a forward-looking blog post about new features that we’re debuting in Salience Five. At our Lexalytics User Group meeting in New York in April, we introduced the “collections” functionality that will be provided in Salience Five. Salience Five is in beta right now, so I decided to put together a worked example of collection functionality using some customer review data for Bally’s in Las Vegas gathered from a public website.

Read on to see how you’ll be able to use collections to analyze a group of documents as a cohesive set, extract the commonly occurring themes (with rollup using our concept matrix), and other pieces of actionable data we’re calling facets.

(more…)

Salience .NET via IronPython

Friday, April 29th, 2011

To round out my overview of ways to get up and running quickly with scripting Salience on Windows, I’ll conclude with another way to take advantage of the .NET wrapper: IronPython.

(more…)

Using Salience via PowerShell (part 2): Tabular data

Saturday, April 23rd, 2011

A common request from customers looking to evaluate Salience Engine is to process a sample set of data. Often this will take the form of an Excel or CSV file in which one column contains the text to be processed. I’m going to show one way of tackling this problem, using PowerShell.

(more…)

Ten things to know about Salience (part 2)

Friday, April 22nd, 2011

This is a follow-on to our list of ten things to know about Salience Engine. Together, these two articles are intended to guide developers in some of the main aspects of working with Salience Engine when they first start out.

In the first part, most of the topics focused on deployment strategies and approaches. In this second part, we’ll look at areas of tuning results from Salience Engine. So let’s roll up our shirt-sleeves and get back into it…

(more…)

Ten things to know about Salience (part 1)

Friday, April 22nd, 2011

I had a meeting with a client recently, and one of the suggestions they raised was a list of the top 10 things that an engineer should know when they start working with Salience Engine. Some of these may seem basic; however, it’s not safe to assume that things which seem obvious actually are. With all due respect to David Letterman and his Late Night Top Ten lists, here we go…

(more…)

Using Salience via PowerShell (part 1)

Friday, April 15th, 2011

By way of introduction, my name is Matt King and I’m a Solution Architect in the Lexalytics Services group. I’m also the guy who brought you the interactive Salience Python script the other day. Most of my current work is on Linux (Python/Java/bash/etc.) and both my home and work laptops run OS X as the primary OS. I do have VMware Fusion with a Windows OS, but until a day or two ago that was a copy of XP Professional that I dutifully purchased back in 2008.

After upgrading to Windows 7, I was looking around for something to do. As I’ve been hearing good things about PowerShell, I figured it was worth checking out. But what to do with it? I’d heard that one of the cooler things, besides the object-passing pipelines, is that it allows easy access to just about everything via .NET. And Salience comes with a .NET wrapper… (more…)