Archive for the ‘Salience Five’ Category

Support Notes: Deploying the Salience assemblies on a Windows network drive

Thursday, December 20th, 2012

This is a brief note from a professional services engagement currently in progress. Our professional services team is developing a custom Excel plug-in for a client that uses Salience to analyze the content of their worksheets within the familiar Excel environment.

The plug-in integrates with Salience through a Microsoft Office add-in written in C# using the Salience .Net wrapper. The client wanted end-user desktops to have only the add-in installed, with Salience and its related data directory residing on a mapped network drive. Read on to see how we accomplished this.


Feature highlight: Complex stems in Salience 5.1

Friday, November 16th, 2012

With every release of Salience, we work to enhance the engine’s ability to extract meaning from the ever-changing landscape of unstructured content. That might sound like a mouthful of marketing mumbo-jumbo, but this is a blog for developers, so what do I mean by it? I mean adding new methods and techniques, and tweaking existing ones, so the engine handles all kinds of text gracefully and derives meaning from it. One such feature, added in Salience 5.1, is a new option called Complex Stems. Let’s have a look at how and when you might use it.


To-may-to, to-mah-to. To Salience it’s all the same

Thursday, November 15th, 2012

Soon after we started releasing support for non-English languages, we started getting questions about which dialects of French or Spanish or Portuguese we were able to analyze. It makes sense, as there are distinct differences within these major languages as spoken in different parts of the world. Even in English, you find subtle differences between “American English” and “UK English”. Luckily for Salience, many of the differences have little impact, and those that do can be easily addressed. Let’s look at where these differences appear.


Multi-language support in Salience

Friday, October 12th, 2012

At Lexalytics, we know it’s not only a global marketplace, but a multi-lingual global marketplace. It’s this understanding that has driven us to extend the capabilities of Salience beyond the analysis of English content alone. This article details the evolution and current state of our support for performing text analytics on English and non-English content.


Sizing your Salience Five deployment

Thursday, December 22nd, 2011

This is another extract from our customer files. Not something that comes up all the time, but often enough that it warranted a blog article with a good worked example.

In general, Salience Engine has been and continues to be very economical in terms of hardware requirements. Text analytics with Salience Engine is more CPU intensive than I/O or memory intensive, though the inclusion of the Concept Matrix™ in Salience Five has increased the memory footprint.

So let’s say you’re looking to process 2 million documents per day, where half are tweets and half are news articles of 4kb or less. What kind of hardware spec are you looking at? Read on to see how you could spec out handling this amount of content with Salience Five.
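The full worked example is in the post itself, but the shape of the calculation can be sketched in a few lines. The workload numbers below come from the scenario above; the per-core throughput rates are hypothetical placeholders for illustration, not measured Salience benchmarks.

```python
# Back-of-envelope sizing sketch for the scenario described above.
# The per-core document rates are ASSUMED illustrative values,
# not measured Salience Engine figures.

DOCS_PER_DAY = 2_000_000
tweets = DOCS_PER_DAY // 2        # short documents
articles = DOCS_PER_DAY // 2      # news articles of 4 KB or less

# Hypothetical single-core processing rates, in documents per second.
TWEET_RATE = 50.0
ARTICLE_RATE = 10.0

SECONDS_PER_DAY = 24 * 60 * 60

# Core-seconds of CPU work required per day for each content type.
tweet_work = tweets / TWEET_RATE
article_work = articles / ARTICLE_RATE

# Average number of cores needed to sustain the daily load.
cores_needed = (tweet_work + article_work) / SECONDS_PER_DAY
print(f"Average sustained load: {cores_needed:.1f} cores")
```

With these assumed rates the daily work comes to 120,000 core-seconds, or a sustained average of roughly 1.4 cores; in practice you would also provision headroom for traffic spikes and, in Salience Five, for the Concept Matrix™ memory footprint.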


Entity extraction in Salience Five

Wednesday, December 21st, 2011

I wanted to write up, for a client, a detailed explanation of the methods of entity extraction available in Salience Five: where they overlap and where they differ. And as I did, I thought, “That would make for a bloody useful post for the dev blog.” So here it is.

Prior to Salience 4.x, entity extraction was solely list-based. Salience 4.0 introduced model-based entity extraction, which allowed for novel entity extraction. In other words, “I didn’t think to add ‘John Smith’ to my list of people to extract, but Salience Engine found him in today’s news magically because it knows what names of people look like.” Very powerful stuff.
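The difference between the two approaches can be illustrated with a toy sketch. To be clear, this is not the Salience API; the “model” here is just a crude capitalization pattern standing in for a real statistical model, and the names are made up for the example.

```python
import re

# Toy contrast of the two extraction styles described above.
# NOT the Salience API -- purely a conceptual illustration.

# List-based: only entities enumerated ahead of time are ever found.
PEOPLE_LIST = {"Jane Doe"}

def list_based(text):
    """Return listed names that appear verbatim in the text."""
    return [name for name in PEOPLE_LIST if name in text]

# "Model-based" stand-in: a crude pattern that knows what names of
# people tend to look like (two capitalized words). A real model is
# far more sophisticated; the point is it can find NOVEL names.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def model_based(text):
    """Return anything that matches the name-like pattern."""
    return NAME_PATTERN.findall(text)

text = "John Smith met Jane Doe downtown today."
print(list_based(text))   # finds only the name we listed
print(model_based(text))  # also finds 'John Smith', never listed anywhere
```

The list-based pass misses “John Smith” because nobody thought to add him, while the pattern-based pass picks him up anyway; that, in miniature, is what novel entity extraction buys you.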

Salience Five continues to provide the model-based and list-based entity extraction found in Salience 4.x, with some of the same cross-over between the two and some modifications to the terminology.


Analyzing collections with Salience Five

Monday, July 25th, 2011

This is a bit of a forward-looking blog post about new features that we’re debuting in Salience Five. At our Lexalytics User Group meeting in New York in April, we introduced the “collections” functionality that will be provided in Salience Five. Salience Five is in beta right now, so I decided to put together a worked example of collection functionality using some customer review data for Bally’s in Las Vegas gathered from a public website.

Read on to see how you’ll be able to use collections to analyze a group of documents as a cohesive set, extract the commonly occurring themes (with rollup using our Concept Matrix™), and surface other pieces of actionable data we’re calling facets.