Tuesday, June 10, 2008

Some Factz About Powerset

Powerset is a new search engine that uses natural language processing (NLP) to return results that are both more varied and more accurate than Google's. NLP technology extracts and integrates meaning from linguistic structures and the relationships between words, instead of treating all text as strings of unrelated key terms. It's pretty rockin'.

For example, the queries "What did Hillary say about Bill?" and "What did Bill say about Hillary?" return a fairly similar set of somewhat useful results in Google. Scanning them, it is evident that they were brought up by the words "hillary," "bill," and "say." No synonyms, verb conjugations, or permutations were returned. Although these sentences contain the same keywords, they don't at all mean the same thing -- but Google treats them more or less as if they did.
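To see why pure keyword matching can't tell these two questions apart, here is a minimal sketch in Python. It is not how Google actually works; the stopword list and the `keywords` function are made up for illustration. It only shows that once word order is thrown away, both queries collapse into the same set of terms.

```python
import re

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"what", "did", "about"}

def keywords(query):
    """Return the set of lowercase content words in a query, ignoring order."""
    words = re.findall(r"[a-z]+", query.lower())
    return {w for w in words if w not in STOPWORDS}

q1 = "What did Hillary say about Bill?"
q2 = "What did Bill say about Hillary?"

# Both queries reduce to the same three keywords, so a matcher that
# only sees these sets cannot know who said what about whom.
print(keywords(q1) == keywords(q2))  # True
```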

The results of the same queries in Powerset reflect the subtlety of the tool. It seems to grasp the "aboutness" of the question, and brings back entries that contain more than verbatim pieces of the query. Verbs like "claim" and "vow" and "state" are brought back as variants of "say."
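One simple way to picture the verb variants is a lookup table. The sketch below is an assumption for illustration only, not Powerset's actual method: `VERB_VARIANTS`, `expand_verb`, and `verb_matches` are hypothetical names, and the table holds just the verbs mentioned above.

```python
# Hypothetical hand-written table; the only entries are the verbs that
# came back as variants of "say" in the results described above.
VERB_VARIANTS = {
    "say": {"say", "claim", "vow", "state"},
}

def expand_verb(verb):
    """Return the verb plus any variants listed for it in the table."""
    return VERB_VARIANTS.get(verb, {verb})

def verb_matches(query_verb, document_verb):
    """True if the document's verb counts as a variant of the query's verb."""
    return document_verb in expand_verb(query_verb)

print(verb_matches("say", "claim"))  # True
print(verb_matches("say", "deny"))   # False
```

A real system would presumably also handle conjugations like "claimed" or "vows", which this toy table ignores.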

One of the coolest things about Powerset is the list of "Factz" that appears along with the traditional list of links. Factz take the form subject-verb-object and can add up to a wonderful list of simple but unexpected sentences describing the subject of the query.
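The subject-verb-object shape is easy to sketch as a small data structure. Everything below is hypothetical: the `Fact` type, the `find_facts` helper, and the two placeholder facts are mine, not Powerset's data or API. The point is that, unlike a bag of keywords, a triple remembers who did what to whom.

```python
from collections import namedtuple

# A fact in subject-verb-object form.
Fact = namedtuple("Fact", ["subject", "verb", "object"])

# Placeholder facts, made up purely for illustration.
facts = [
    Fact("Hillary", "say", "Bill"),
    Fact("Bill", "say", "Hillary"),
]

def find_facts(facts, subject=None, verb=None, obj=None):
    """Return facts matching every field that was given; None means 'any'."""
    return [
        f for f in facts
        if (subject is None or f.subject == subject)
        and (verb is None or f.verb == verb)
        and (obj is None or f.object == obj)
    ]

# Same keywords, different roles: only the fact with Bill as the
# subject is returned.
for fact in find_facts(facts, subject="Bill"):
    print(fact.subject, fact.verb, fact.object)
```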

As with any complex process, Powerset's mistakes can be even more impressive than its successes because they reveal how much work it's actually doing. I entered the query "What did Britney do?" and learned that: "Britney speared songs and samples." The parser is so enthusiastic that it went right ahead and parsed her last name! It's endearing, like the errors of overgeneralization children make, but also indicative of the powerful linguistic processes humming beneath the surface.

And although it couldn't tell me where Jimmy Hoffa is, it does know who killed Laura Palmer.

1 comment:

Mark Johnson said...

If only people didn't have last names that were verbs! Seriously, you're absolutely correct that mistakes often show how much work we do as humans (and how much work Powerset is doing!) to disambiguate sentences. The Britney:spears example will only be around for a little while longer, so enjoy it while it's there.