
Literary History, Seen Through Big Data’s Lens - NYTimes.com


Stashed in: Google!, Words!, History!, Big Data!, Communication, Advertising, Books!


This is surprising:

ANY list of the leading novelists of the 19th century, writing in English, would almost surely include Charles Dickens, Thomas Hardy, Herman Melville, Nathaniel Hawthorne and Mark Twain.

But they do not appear at the top of a list of the most influential writers of their time. Instead, a recent study has found, Jane Austen, author of “Pride and Prejudice,” and Sir Walter Scott, the creator of “Ivanhoe,” had the greatest effect on other authors, in terms of writing style and themes.

These two were “the literary equivalent of Homo erectus, or, if you prefer, Adam and Eve,” Matthew L. Jockers wrote in research published last year. He based his conclusion on an analysis of 3,592 works published from 1780 to 1900. It was a lot of digging, and a computer did it.

I've read some Jane Austen, but I've never read anything from Sir Walter Scott.

To the library!


Google cooperated and built the software for making graphs open to the public. The initial version of Google’s cultural exploration site began at the end of 2010, based on more than five million books, dating from 1500. By now, Google has scanned 20 million books, and the site is used 50 times a minute. For example, type in “women” in comparison to “men,” and you see that for centuries the number of references to men dwarfed those for women. The crossover came in 1985, with women ahead ever since.

In work published in Science magazine in 2011, Mr. Michel and the research team tapped the Google Books data to find how quickly the past fades from books. For instance, references to “1880,” which peaked in that year, fell to half by 1912, a lag of 32 years. By contrast, “1973” declined to half its peak by 1983, only 10 years later. 

“We are forgetting our past faster with each passing year,” the authors wrote.
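The "fell to half" measurement quoted above is easy to picture in code. Here is a rough sketch of how such a lag could be computed from a year-by-year frequency series. The function name `years_to_half` and the toy numbers are my own invention for illustration; they are not the researchers' code or actual Google Books data.

```python
def years_to_half(freq_by_year):
    """Return (peak_year, lag): lag is the number of years after the peak
    until the frequency first drops to half the peak value or lower.
    lag is None if the series never falls that far."""
    peak_year = max(freq_by_year, key=freq_by_year.get)
    half = freq_by_year[peak_year] / 2
    for year in sorted(y for y in freq_by_year if y > peak_year):
        if freq_by_year[year] <= half:
            return peak_year, year - peak_year
    return peak_year, None


# Toy series, invented for illustration only (NOT real Google Books data).
# It is shaped so the half point lands in 1912, matching the article's example.
toy_1880 = {1880: 100.0, 1890: 80.0, 1900: 62.0, 1912: 50.0, 1920: 41.0}

print(years_to_half(toy_1880))  # (1880, 32)
```

With real data you would feed in the yearly frequencies of a term like “1880” and compare the resulting lags across different years.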

Anecdotally, I believe that we're forgetting our past faster and faster.

This is quite clever:

To train their statistical algorithms on common sentence structure, word order and most widely used words, they fed their computers a huge archive of articles from news wires. The memorable lines consisted of surprising words embedded in sentences of ordinary structure.

“We can think of memorable quotes as consisting of unusual word choices built on a scaffolding of common part-of-speech patterns,” their study said.

Consider the line “You had me at hello,” from the movie “Jerry Maguire.” It is, Mr. Kleinberg notes, basically the same sequence of parts of speech as the quotidian “I met him in Boston.” Or consider this line from “Apocalypse Now”: “I love the smell of napalm in the morning.” Only one word separates that utterance from this: “I love the smell of coffee in the morning.”
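The "only one word separates" observation can be checked mechanically. Below is a small sketch that compares two sentences word by word and reports where they differ. The helper `differing_words` is a hypothetical name of my own, and this is only a toy comparison, not the statistical model Mr. Kleinberg's group trained.

```python
def differing_words(a, b):
    """Compare two sentences position by position and return a list of
    (index, word_in_a, word_in_b) for each position where they differ.
    Assumes both sentences have the same number of words."""
    words_a = a.lower().split()
    words_b = b.lower().split()
    if len(words_a) != len(words_b):
        raise ValueError("sentences must have the same number of words")
    return [
        (i, wa, wb)
        for i, (wa, wb) in enumerate(zip(words_a, words_b))
        if wa != wb
    ]


quote = "I love the smell of napalm in the morning"
plain = "I love the smell of coffee in the morning"

print(differing_words(quote, plain))  # [(5, 'napalm', 'coffee')]
```

The scaffolding is identical; the single surprising word is what makes the line stick.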

This kind of analysis can be used for all kinds of communications, including advertising.

Indeed, Mr. Kleinberg’s group also looked at ad slogans. Statistically, the ones most similar to memorable movie quotes included “Quality never goes out of style,” for Levi’s jeans, and “Come to Marlboro Country,” for Marlboro cigarettes.

Those who can master the nuance of language can find success in anything.
