Google’s new Ngram Viewer is a graphical interface for tracking how often words appear over time in the several million books scanned into Google’s database. As a publicly minable data set, it’s huge and ripe for exploration, covering 500 years’ worth of published books in several languages. And while being able to call up how often a word was used in a particular year may seem like mere trivia, the lives of words can illuminate historical and cultural trends in surprising ways.
A paper published by researchers who helped develop the project (and summarized by Discover) rounded up a few interesting findings. One delectably recursive tidbit: searching for the years themselves (e.g., 1865 or 1990) shows how much attention each year received over time and the extent to which it remains part of present-day discussion.
They found a general trend that each individual year follows: a spike just before the year arrives, followed by a long, downward-trending tail as it recedes into history. They also noticed, however, that the pattern itself is shifting: more recent years have higher peaks and shorter tails.
When the team looked at the frequency of individual years, they found a consistent pattern. In their own words: “‘1951’ was rarely discussed until the years immediately preceding 1951. Its frequency soared in 1951, remained high for three years, and then underwent a rapid decay, dropping by half over the next fifteen years.” But the shape of these graphs is changing. The peak gets higher with every year and we are forgetting our past with greater speed. The half-life of ‘1880’ was 32 years, but that of ‘1973’ was a mere 10 years.
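For readers who want to see what a “half-life” means for a word’s frequency, here is a minimal Python sketch. It is not the researchers’ method: it simply assumes the half-life is the number of years between a term’s peak and the first year its frequency falls to half the peak value. The function names and the data are hypothetical; the series are synthetic decay curves built to match the 32-year and 10-year figures quoted above, not real Ngram data.

```python
def half_life(series):
    """Years from the peak until the frequency first falls to half the peak.

    `series` maps publication year -> relative frequency of a term.
    Returns None if the frequency never drops that far within the data.
    This definition is an assumption made for illustration.
    """
    peak_year = max(series, key=series.get)
    half = series[peak_year] / 2
    for year in sorted(y for y in series if y > peak_year):
        if series[year] <= half:
            return year - peak_year
    return None


def synthetic_series(peak_year, peak_value, true_half_life, span=60):
    """Build a made-up exponential decay curve starting at the peak year."""
    return {
        peak_year + t: peak_value * 0.5 ** (t / true_half_life)
        for t in range(span)
    }


# Synthetic stand-ins for '1880' and '1973': the later year gets a
# higher peak and a faster decay, mirroring the trend described above.
older = synthetic_series(1880, peak_value=1.0, true_half_life=32)
newer = synthetic_series(1973, peak_value=2.0, true_half_life=10)

print(half_life(older))
print(half_life(newer))
```

Fed real yearly frequencies for a term instead of the synthetic curves, the same function would give a rough measure of how quickly that term faded after its peak.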
So, at a cultural level, we can see a developing ‘presentism’ in which the year we’re currently inhabiting takes on great significance, but is more quickly forgotten once it’s passed.