
A Computer Science Approach to Linguistic Archeology and Forensic Science

Last week (Sept 2014), I heard a story on NPR’s Morning Edition that really got me thinking… (Side note: I’m in Ontario, so there is no NPR, but my favourite station is WKSU via TuneIn radio on my smartphone.) It was a short story, but I thought it was one of the most interesting I’ve heard in the last few months, and it got me thinking about how computer science has been used to understand natural language cognition.

Linguistic Archeology

Here is a link to the actual story (with transcript). MIT computer scientist Boris Katz realized that when people learn English as a second language, they make certain errors that are a function of their native language (e.g. native Russian speakers leave out articles in English). This is not a novel finding; people have known this. Katz, by the way, is one of the many scientists who worked with Watson, the IBM computer that competed on Jeopardy!

Katz trained a computer model on samples of written English so that it could detect a writer’s native language from the errors in their English text. But the model also learned similarities among the native languages themselves. Based only on errors in English, it discovered that Polish and Russian have historical overlap. In short, the model was able to recover the well-known linguistic family tree among many natural languages.
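To make the idea concrete, here is a minimal sketch of how such a model could work. It is not Katz’s actual method: the feature list, the function names, the labels `lang_a` and `lang_b`, and the toy sentences are all invented for illustration. Each text is reduced to the rate at which it uses a few function words, each language gets an average profile, and a new text is assigned to the nearest profile.

```python
import math
import re
from collections import defaultdict

# Hypothetical feature set: small function words whose use may differ
# depending on the writer's native language (articles, prepositions).
FEATURE_WORDS = ["a", "an", "the", "of", "in", "on"]


def features(text):
    """Return the per-token rate of each feature word in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [tokens.count(word) / total for word in FEATURE_WORDS]


def train(samples):
    """Average the feature vectors of each language into one profile."""
    grouped = defaultdict(list)
    for language, text in samples:
        grouped[language].append(features(text))
    return {
        language: [sum(column) / len(vectors) for column in zip(*vectors)]
        for language, vectors in grouped.items()
    }


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict(profiles, text):
    """Return the language whose profile is closest to the text."""
    vector = features(text)
    return min(profiles, key=lambda lang: distance(profiles[lang], vector))


# Invented toy sentences, not real learner data. "lang_a" writers drop
# articles (like the Russian example above); "lang_b" writers keep them.
samples = [
    ("lang_a", "I saw movie yesterday and movie was good"),
    ("lang_a", "She bought book in shop near station"),
    ("lang_b", "I saw the movie in the cinema of the city"),
    ("lang_b", "She bought a book in the shop near the station"),
]
profiles = train(samples)

print(predict(profiles, "He read book in library"))
print(predict(profiles, "He read the book in the library"))
```

The same `distance` function can be applied to two language profiles rather than to a text and a profile. With many languages and much richer features, languages whose profiles sit close together would be candidates for being historically related, which is roughly the kind of similarity the story describes.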

The next step is to use the model to uncover new things about dying or disappearing languages. As Katz says:

“But if those dying languages have left traces in the brains of some of those speakers and those traces show up in the mistakes those speakers make when they’re speaking and writing in English, we can use the errors to learn something about those disappearing languages.”

Computational Linguistic Forensics

This is only one example. Another one that fascinated me was the work of Ian Lancashire, an English professor at the University of Toronto, and Graeme Hirst, a professor in the computer science department. They noticed that the output of Agatha Christie (she wrote around 80 novels and many short stories) declined in quality in her later years. That itself is not surprising, but they thought there was a pattern. After digitizing her work, they analyzed the technical quality of her output and found that the richness of her vocabulary fell by one-fifth between the earliest two works and the final two works. That, and other patterns, are more consistent with Alzheimer’s than with normal aging. In short, they are tentatively diagnosing Christie with Alzheimer’s disease, based on her written work. You can read a summary HERE and you can read the actual paper HERE. It’s really cool work.
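For a sense of what “richness of vocabulary” can mean in practice, here is a minimal sketch using one common proxy, the type-token ratio (distinct words divided by total words). This is an assumption on my part: the exact measures in the paper may differ, and the function names and the `window` parameter are invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def vocabulary_richness(text, window=None):
    """Type-token ratio: distinct words divided by total words.

    If `window` is given, only the first `window` tokens are used, so
    that texts of different lengths can be compared on equal footing.
    """
    tokens = tokenize(text)
    if window is not None:
        tokens = tokens[:window]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def relative_change(early, late):
    """Fractional change from an early score to a late score."""
    return (late - early) / early


# "the" appears twice, so 5 distinct words out of 6 tokens.
print(vocabulary_richness("the cat sat on the mat"))
```

The fixed window matters because the ratio naturally falls as a text gets longer, so comparing a whole short novel against a whole long one would be misleading. With scores for an early and a late novel in hand, a drop of one-fifth would show up as `relative_change(early, late)` being close to -0.2.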

Text Analysis at Large

I think this work is really fascinating and exciting. It highlights just how much can be understood via text analysis. Some of this is already commonplace. We educators rely on software to detect plagiarism. Facebook and Google are using these tools as well. One assumes that the NSA might be able to rely on many of these same ideas to infer and predict information and characteristics about the author of some set of written statements. And if a computer can detect a person’s linguistic origin from English textual errors, I’d imagine it can be trained to mimic the same effects and produce English that looks like it was written by a native speaker of another language… but was not. That’s slightly unnerving…

Gladwell versus the academy (a modern David and Goliath)

I’ll start with an admission: I have never read any of Malcolm Gladwell’s books.

It’s nothing personal or principled, but I just never got around to it; I tend to prefer reading fiction in my spare time anyway. I have enjoyed some of his essays in the New Yorker, but that’s about it. So I am not writing about the content of his books. I’m writing about the reception that his books receive, the criticisms, and the apparent belief by many that he’s a scientist. This, it seems, really bothers some actual scientists.

Malcolm Gladwell is an enormously successful and gifted writer. No one can argue with this. His books Blink, The Tipping Point, and Outliers have made some of the most interesting and exciting ideas in cognition, social psychology, and neuroscience accessible to many people outside the academic and scientific world. He has had a long career as a journalist, is well read, and he’s no Jonah Lehrer…

With each book, Gladwell’s stature has grown, but I have noticed the reaction from academics has been less than enthusiastic. Many feel that he misunderstands (or worse, misrepresents) the scientific studies upon which many of his books are built. Dan Simons and Chris Chabris are two of the more vocal critics, and they are both well-respected and well-known scientific psychologists. They argued (in an article posted in the Chronicle of Higher Education) that many people were overly enthusiastic about the premise of Blink, namely that intuition can produce better outcomes than analytic cognition. It’s not so much that they thought the book was wrong as that they felt everyone was misinterpreting what it was about. In fact, Simons and Chabris are the authors of The Invisible Gorilla: How Our Intuitions Deceive Us, which argues that human intuitions can be very deceptive. The title, by the way, refers to one of Simons’s best-known experiments.

They are not the only vocal critics. Steven Pinker is probably closer to Malcolm Gladwell in terms of being a public intellectual (and he has received his fair share of criticism as well). He too is critical of Gladwell’s books, for some of the same reasons. In a review of Outliers, Pinker writes that “The reasoning in ‘Outliers,’ which consists of cherry-picked anecdotes, post-hoc sophistry and false dichotomies, had me gnawing on my Kindle.”

So now Malcolm Gladwell has a new book, David and Goliath. As I mentioned before, I have not read this book, so I make no attempt to provide my own critique. But one anecdote in particular seems to have garnered a lot of attention. Gladwell discusses several stories of people who became very successful despite having dyslexia. His thesis seems to be that having dyslexia made it just a little harder for these people to get by, and so maybe they worked a little harder and compensated for the dyslexia and thus achieved greatness. Gladwell calls this “the theory of desirable difficulty.” He bases this (apparently) on a study from 2007 in which subjects who read mathematical reasoning problems in a hard-to-read typeface actually outperformed subjects who read the same problems in an easier-to-read typeface. So there may be a connection, but there may not be.

In a recent review in the WSJ, Christopher Chabris takes Gladwell to task. He points out that the 2007 study in question has not replicated that well. He wonders why Gladwell does not point this out. He wonders why Gladwell asserts as “laws” phenomena with many possible interpretations. The review is critical, and very good, and points out what I really think people should be aware of when they read Gladwell’s book, namely that it contains interesting anecdotes mixed with science, and that the writing is very good and persuasive. This need not be a bad thing, and Gladwell and his supportive critics point out that this is a great narrative form, and is exactly what makes Gladwell so good. Stories matter. Narrative matters. But the expanded version on Chabris’s blog went further: Chabris worries that Gladwell knows full well that people overinterpret his books and that he simply does not care. He writes, “I can certainly think of one gifted writer with a huge audience who doesn’t seem to care that much. I think the effect is the propagation of a lot of wrong beliefs among a vast audience of influential people. And that’s unfortunate.”

Ouch.

Is this envy? I do not think so. Dan Simons and Chabris are successful authors in their own right. So is Steven Pinker. But the difference is that they are also successful academics and researchers. Chabris makes the point that many people simply consider Gladwell to be an authority, rather than an author. The term “Gladwellian” exists.

The review was critical enough to cause Mr. Gladwell to respond on Slate.com. Gladwell suggested that “Chabris should calm down,” and he even takes a mild swipe at Mr. Chabris’s wife. Why so personal? I will confess that I did not find Gladwell’s Slate response to be very flattering. It came across as arrogant and dismissive. Does Gladwell imagine himself as the David and the Academy as the Goliath? Possibly, though I’m inclined to think the opposite. Gladwell’s “brand” is so big that he is very likely the Goliath in this fight. And (in keeping with the theme of his new book) his gifts, above all his incredible writing talent, may very well be what could bring him down.

In the end, I’m glad that this debate is even able to happen. I’m glad that there is a journalist and writer like Malcolm Gladwell who is interested in and excited enough by human behavior and psychology to write best sellers. I’m glad that there are serious and respected scientists like Chabris and Simons to call him out when the claims go too far.

In the course of following these criticisms and counter-criticisms I’ve become much more interested in reading this work. I fully plan to read Gladwell’s book of essays (What The Dog Saw) and some of his other books. I’m planning to read Simons and Chabris’s book as well. All concerned parties can rest assured that I’ll be checking them out of my public library soon, and that no actual cash will flow.