I have been playing around with the Google Books Ngram Viewer. There is a nice, humorous introduction to it in the TED video What we learned from 5 million books. It’s an addictive tool that lets you search for words and ideas in a database of 5 million books spanning several centuries.
The Ngram Viewer works rather simply. After you enter a word or phrase (up to five words), the tool displays a graph charting how frequently your term has appeared in books over the past five centuries. By default, the Ngram Viewer taps into books written in English. But you can change that to a different “corpus” or category of books, such as American English, British English, English Fiction, Chinese, French, German, Russian, or Spanish.
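To make the idea of charting frequency a little more concrete, here is a minimal Python sketch of the kind of calculation involved: for each year, count how often a phrase occurs and divide by the total number of same-length phrases from that year. The toy corpus and the function names (ngrams, yearly_frequency) are made up for illustration, and lowercasing everything is a simplification of my own. This is not Google’s actual code, just a rough picture of the idea.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Relative frequency of `phrase` per year.

    `corpus` is a hypothetical mapping of year -> list of book texts.
    Matching is case-insensitive here, which is an assumption made
    to keep the sketch short.
    """
    phrase = phrase.lower()
    n = len(phrase.split())
    result = {}
    for year, texts in sorted(corpus.items()):
        total = 0  # all n-grams of this length seen in this year
        hits = 0   # how many of them equal the phrase
        for text in texts:
            grams = ngrams(text.lower().split(), n)
            total += len(grams)
            hits += grams.count(phrase)
        result[year] = hits / total if total else 0.0
    return result


# Invented two-year "library", purely for demonstration.
toy_corpus = {
    1900: ["the frankfurter was served hot"],
    1950: ["a hot dog and a hot dog bun"],
}

freq = yearly_frequency(toy_corpus, "hot dog")
for year, value in freq.items():
    print(year, round(value, 3))
```

Plotting those yearly values against the year would give a small-scale version of the graph the viewer draws.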
You can vary the years tracked, all the way from 1500 to 2008 or anywhere in between. Providing a wide range of years gives you more of an overview, while narrowing the years lets the tool graph a word’s usage in a more granular fashion year by year.
You can enter multiple terms to compare their popularity. For example, typing the two terms “frankfurter” and “hot dog” shows that usage of “frankfurter” has remained steady over the years, while “hot dog” has kept climbing in popularity since the early 1920s.
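If you want to share or bookmark a comparison like this one, the search can be written out as a link. The sketch below builds one in Python. The parameter names (content, year_start, year_end) are the ones I have seen in the address bar when the viewer draws a graph, so treat them as an assumption that may change. The helper name ngram_url is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the graph page; assumed from the viewer's own links.
BASE_URL = "https://books.google.com/ngrams/graph"


def ngram_url(terms, year_start=1500, year_end=2008):
    """Build a viewer link comparing several comma-separated terms."""
    query = urlencode({
        "content": ",".join(terms),  # terms are separated by commas
        "year_start": year_start,
        "year_end": year_end,
    })
    return f"{BASE_URL}?{query}"


url = ngram_url(["frankfurter", "hot dog"], year_start=1900)
print(url)
```

Opening the printed link in a browser should show the same comparison as typing the terms into the search box, as long as the viewer still accepts those parameters.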
But although the Ngram Viewer can tell you how frequently a certain word or phrase has shown up in books, it can’t tell you why. Nor can it necessarily explain the meaning of that word or phrase at the time it was used. So discovering that the word “android” first appeared in books in the mid-18th century is interesting, but did it mean the same thing to an Enlightenment reader as it does to someone in the era of Google?
You can, however, select a certain year or range of years to view a page that lists the books containing your chosen word or phrase. By clicking on a specific book you can see the actual digitized pages, which in some cases can provide a bit of insight into how the word was used at the time.
So I thought I’d run a search across the 500 billion words in the Ngram Viewer’s database. First I searched for the most popular instrument from a choice of sax, trumpet, trombone, and clarinet. I got this:
Trumpet wins, albeit on very superficial search criteria. But wait… how about a more popular instrument grouping: sax, piano, violin, and guitar? When we do that search we find: