March 29, 2012
We live in an era when artificial intelligence is being used to take over countless tasks that used to be reserved for humans—everything from competing on Jeopardy! to answering phones at call centers. Now a new technology is sure to strike fear into the heart of any journalist, reporter or blogger. Software is being developed that can use raw data—such as Twitter feeds, company earnings reports and baseball box scores—to automatically produce news articles that seem as though they were written by a real live human. For better or worse, welcome to the brave new world of computerized journalism.
The most prominent example is a startup called Narrative Science, which has made waves (and raised $6 million in capital) by pioneering computer software that analyzes these sorts of datasets and writes everything from stock advice to sports analysis.
Previous efforts by other programmers to automate journalism led to formulaic, unvarying articles. But Narrative Science’s cofounders, Kris Hammond and Larry Birnbaum of Northwestern University’s Intelligent Information Lab, have developed algorithms that can do some remarkable things. The software, for example, can interpret box scores to determine an appropriate angle for a game recap, distinguishing between a blowout, a come-from-behind victory, or a close loss.
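To make the idea of picking an angle from a box score concrete, here is a minimal sketch in Python. It is purely illustrative: Narrative Science's software is proprietary, so the function name `game_angle`, the thresholds, and the labels below are all assumptions, not the company's actual method.

```python
from itertools import accumulate


def game_angle(team_by_period, opp_by_period,
               blowout_margin=15, close_margin=3, comeback_deficit=10):
    """Pick a recap angle from one team's point of view.

    All thresholds are hypothetical values chosen for illustration.
    """
    if not team_by_period or len(team_by_period) != len(opp_by_period):
        raise ValueError("both teams need the same non-zero number of periods")

    # Running totals at the end of each period.
    team_running = list(accumulate(team_by_period))
    opp_running = list(accumulate(opp_by_period))
    final_margin = team_running[-1] - opp_running[-1]

    # Largest deficit the team faced at any period break before the end.
    worst_deficit = max(
        (o - t for t, o in zip(team_running[:-1], opp_running[:-1])),
        default=0,
    )

    if final_margin >= blowout_margin:
        return "blowout"
    if final_margin > 0 and worst_deficit >= comeback_deficit:
        return "come-from-behind victory"
    if -close_margin <= final_margin < 0:
        return "close loss"
    return "routine result"
```

A real system would presumably weigh many more signals (individual performances, season context, rivalries), but even this toy version shows how a table of numbers can be mapped to a story line before any sentence is written.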
Recently, the software has been employed to analyze tweets about political candidates, noting that Newt Gingrich attracted positive public attention by focusing on tax issues, but also received considerable criticism on character issues. Future uses, the company suggests, could include articles on data sets such as crime stats, medical study results and surveys.
The writing may not read like poetry, but it gets the point across in language less stilted than you might expect, and would likely fool readers unaware that a software program wrote the article. In his blog, Just to Clarify, Hammond writes that the company uses an editorial staff with expertise in the field to manually configure the engine for each type of data. The software is proprietary, so publicly available details on how the system works are somewhat vague, but Hammond says that its ability to subtly mimic the human voice is improving all the time.
Although most of the company’s 30 or so clients use the service for internal memos—and, presumably, most news organizations would prefer to keep quiet about their robot-written articles—there are already several examples of published articles that were written using the software. A small section of Forbes.com features articles with the byline “Narrative Science.” The Big Ten Network has used the software to publish nearly instant recaps seconds after games have ended. And Hanley Wood, a construction trade publisher, has employed Narrative Science to comb through data on housing trends and publish articles on its site, builderonline.com.
What are the consequences of this trend? Well, if the software improves to the point that it rivals the work of humans, it could theoretically outcompete traditional journalism, since the cost is so much lower. Last fall, it was reported that Hanley Wood paid roughly $10 for each 500-word article—much less, by most estimates, than the cost of paying actual writers.
Doomsayers may warn that this portends the end of journalism as we know it: the beginning of a world where our news comes to us untouched by human hands and armies of angry writers are out of work. Narrative Science, though, suggests that its software is most useful for small companies looking to extend or enrich their coverage of a previously overlooked area.
We’re not sure who to believe. We can only promise you one thing: This article was written by a real live human.