Saturday, September 24, 2005


It seems that while Google Print manages to get itself into hot water with the publishers, and librarians try to prove why we are better than Google, Amazon is trying to stay out of the spotlight. I just discovered a new feature on Amazon that was released in April! Maybe it passed by without me noticing it (it was a busy summer), but it looks like the Washington Post only noticed it last month. Amazon has enhanced its Search Inside feature to include lists of SIPs (Statistically Improbable Phrases) and CAPs (Capitalized Phrases). According to Amazon:

"Amazon.com's Statistically Improbable Phrases, or "SIPs", are the most distinctive phrases in the text of books in the Search Inside!™ program. To identify SIPs, our computers scan the text of all books in the Search Inside! program. If they find a phrase that occurs a large number of times in a particular book relative to all Search Inside! books, that phrase is a SIP in that book.

SIPs are not necessarily improbable within a particular book, but they are improbable relative to all books in Search Inside!. For example, most SIPs for a book on taxes are tax related. But because we display SIPs in order of their improbability score, the first SIPs will be on tax topics that this book mentions more often than other tax books. For works of fiction, SIPs tend to be distinctive word combinations that often hint at important plot elements.

Click on a SIP to view a list of books in which the phrase occurs."
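Amazon doesn't say how the improbability score is actually calculated, so the following is only a rough sketch of the idea as described above, not their method. It assumes phrases are two-word pairs and scores each one by how often it appears in one book compared with the whole collection. The function names, the `min_count` cutoff, and the add-one smoothing are all my own assumptions.

```python
import re
from collections import Counter


def bigrams(text):
    """Lowercase the text and return its two-word phrases."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(pair) for pair in zip(words, words[1:])]


def improbable_phrases(book, corpus, top=3, min_count=2):
    """Rank phrases in `book` by how over-represented they are in it.

    `corpus` is a list of texts standing in for "all Search Inside! books"
    (it may include `book` itself). The score is a hypothetical one:
    relative frequency in the book divided by a smoothed relative
    frequency in the corpus.
    """
    book_counts = Counter(bigrams(book))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(bigrams(text))

    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())

    scores = {}
    for phrase, count in book_counts.items():
        if count < min_count:
            continue  # ignore phrases the book barely uses
        book_rate = count / book_total
        # Add-one smoothing keeps the denominator above zero.
        corpus_rate = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        scores[phrase] = book_rate / corpus_rate

    # Highest score first, mirroring "in order of their improbability score".
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

Calling `improbable_phrases(one_book, all_books)` returns the phrases that one book uses far more than the rest of the collection does, which is roughly why a tax book's first SIPs are the tax topics other tax books mention less.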

"Capitalized Phrases, or "CAPs", are people, places, events, or important topics mentioned frequently in a book. Along with our Statistically Improbable Phrases, Capitalized Phrases give you a quick glimpse into a book's contents.

Click on a Capitalized Phrase to view a list of books in which the phrase occurs. You can also view a list of references to the Capitalized Phrase in each book.

For example, if you're looking at a Sherlock Holmes mystery, you can click on "Professor Moriarty" to see a list of books that feature or mention Holmes's nemesis."
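Capitalized Phrases sound simpler to find. Here is a minimal sketch, assuming a CAP is just a run of two or more capitalized words that shows up at least `min_count` times in a book. Amazon doesn't describe its real rules, so the pattern and the threshold here are my own guesses.

```python
import re
from collections import Counter

# Two or more capitalized words in a row, e.g. "Professor Moriarty".
CAP_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def capitalized_phrases(text, min_count=2):
    """Return capitalized phrases that occur at least `min_count` times,
    most frequent first."""
    # Collapse any internal whitespace (such as line breaks) to one space.
    counts = Counter(" ".join(match.split()) for match in CAP_RUN.findall(text))
    return [phrase for phrase, n in counts.most_common() if n >= min_count]
```

A pattern this naive will also pick up a capitalized word at the start of a sentence when a name follows it, so a real system would need some extra filtering.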

The SIPs basically create subject headings for the book, and the CAPs include every person named in the book (which is more than librarians do when creating MARC records). Both of these fields link to the same words in other books. It kind of reminds me of our catalogs, only created by a computer. Although the SIPs don't use a controlled vocabulary, it would only take some mapping to link them to other books with the same subject.

I was surprised when I found this feature, since I hadn't heard about it before. While it seems that I hear about another cool Google feature every day, Amazon is quietly releasing new features of its own.

