Article: Tracking the Waves of Language with the Ngram Viewer
- Dr. Timothy Smith
- Oct 26, 2022
- 3 min read

Did you ever wonder when terms like “cyberspace” faded into obscurity, or when the verb “to google” became shorthand for searching for something on the internet? If so, a tool called the Ngram Viewer can help you find out. Beginning in 2002, Google ambitiously set out to scan all the books in the world and convert them to searchable text, creating a massive digital library called Google Books. According to J. Howard, the effort to digitize the world’s books from academic libraries stalled out at around 25 million books. (edsurge.com)
The Google Books effort collided with authors and publishers, who sued Google for copyright infringement and challenged its reliance on fair use. A professional group of writers called the Authors Guild filed suit against Google in 2005 to prevent Google Books from publishing passages from copyrighted books online. Initially, the courts sided with the Authors Guild, stopping Google’s efforts, but upon appeal, the Second Circuit Court ruled that Google could proceed. (The Atlantic) When the Supreme Court of the United States declined to hear the Authors Guild’s appeal in 2016, the case ended, and Google could continue its efforts.
The vision of a massive digital library providing full-text access to all the world’s books did not materialize, but Google has analyzed the 25 million books in its collection and created a search system called the Ngram Viewer. The Ngram Viewer allows anyone to search for the occurrence of terms in books over time. The term “ngram” refers to a sequence of n consecutive words. In linguistic research, counting how many times each sequence occurs in all the books in a library such as Google Books makes it possible to predict the next word in a sequence. For example, “surfed the internet” and “surfed the web” are three-word ngrams, or three-grams. Plugging these two three-grams into the Ngram Viewer shows both rising rapidly starting in the early 1990s with the emergence of the internet. “Surfed the internet” continues to grow in usage through 2019, while “surfed the web” starts to fall in popularity in 2015.
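To make the idea of counting ngrams concrete, here is a minimal Python sketch. It is not Google's implementation: the function names `ngrams` and `yearly_frequency`, the whitespace tokenization, and the tiny `TOY_CORPUS` are all hypothetical, chosen only for illustration. The sketch assumes that a phrase's popularity in a given year is measured as its share of all ngrams of the same length found in that year's texts.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples.

    Tokenization here is a naive lowercase whitespace split (an assumption
    for this sketch); real systems handle punctuation and much more.
    """
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams equal to phrase.

    corpus is a dict of {year: [text, text, ...]}; n is taken from the
    number of words in phrase.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Avoid dividing by zero when a year has no n-grams of this length.
        result[year] = counts[target] / total if total else 0.0
    return result


# Hypothetical toy data standing in for "books published in a given year".
TOY_CORPUS = {
    1995: ["she surfed the web all night", "he surfed the internet at work"],
    2005: ["they surfed the internet daily", "we surfed the internet together"],
}
```

Calling `yearly_frequency(TOY_CORPUS, "surfed the internet")` returns one number per year, which is the kind of series a line chart of a phrase's usage over time could be drawn from.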
It is interesting to see other terms come and go from books over time. For example, the term “hashtag” did not appear in any books from 1800 to 2005, but it had a meteoric rise starting in 2006, correlating with the launch of Twitter. Twitter uses the hashtag symbol to index words so that people can easily follow topics on the platform. Other terms, such as “ditto,” a word used much like “likewise,” have steadily declined in usage in books since 1800 and are rarely seen today. Sometimes, words experience waves of popularity. The word “poodle” has had a series of spikes in usage, with the first surge in 1850, another in the 1950s, and a more recent resurgence in the late 2010s.

Pictured above: the Ngram of “poodle” popularity via the Ngram Viewer
Back in 2002, Google set out to digitize all the books in the world to create a massive online library. The effort ran into lawsuits that halted it at 25 million books. Although the courts eventually ruled against the Authors Guild, the vision of a digital library of all the world’s books did not materialize. However, Google analyzed all its digitized books and made a search tool called the Ngram Viewer. The Ngram Viewer allows anyone to search for the occurrence of words or groups of words as they appear in books over the past 200+ years. It provides an interesting way to chart the rise and fall of terms in literature and in culture.
If you are interested in checking out the Ngram Viewer for yourself, you can do so here.

Dr. Smith’s career in scientific and information research spans the areas of bioinformatics, artificial intelligence, toxicology, and chemistry. He has published a number of peer-reviewed scientific papers. He has worked over the past seventeen years developing advanced analytics, machine learning, and knowledge management tools to enable research and support high-level decision making. Tim completed his Ph.D. in Toxicology at Cornell University and a Bachelor of Science in chemistry from the University of Washington.
You can buy his book on Amazon in paperback and in Kindle format here.

