In the span of two days, I received as many emails from respectable content marketing blogs worrying about the dangers of machines taking the jobs of bloggers and other content creators. The man vs. machine dynamic has existed since the dawn of the industrial age, but is it finally reaching the point where a technology called Natural Language Generation (NLG) can replace humans in one of their last refuges? Are we reaching a point where the writing of books, blogs, and even poetry will be dominated by the machine?
What is Natural Language Generation?
Natural Language Generators are algorithms designed to gather inputs and then produce a readable, human-like response based on the data provided. Computer-generated natural language has become fairly commonplace, especially with the advent of smartphone assistants such as Siri and Cortana. On a larger scale, NLG can be leveraged to generate blog posts, articles, or data reports. Once programmed with certain patterns, grammar, and word usage, NLGs can produce content that has been found to be indistinguishable from the work of a human writer.
The NLG algorithms can be simple or extremely complex depending on the desired final product. Philip Parker has used NLG algorithms to produce over a million books on very specific subjects such as luggage racks and vocal cord paralysis. To develop the initial algorithm, he estimates 2-3 man-years of programmer time, but once complete, books can be generated in as little as 4 minutes.
In the case of Parker’s books and most other NLG-produced content, you generally need high-quality numerical data. One of the first popular uses for NLG was sports scores and statistics. This allows local papers to automatically generate easy-to-read summaries of extremely local events, such as Little League scores.
These algorithms also allow game summaries to be created almost immediately after the game’s end and automatically provide a narrative that goes beyond the simple box score of a baseball game, for instance. The following example from The New York Times shows a summary written by a human and one written by an NLG; can you tell which is which? (Answer at the end of this article.)
“Things looked bleak for the Angels when they trailed by two runs in the ninth inning, but Los Angeles recovered thanks to a key single from Vladimir Guerrero to pull out a 7-6 victory over the Boston Red Sox at Fenway Park on Sunday.”
“The University of Michigan baseball team used a four-run fifth inning to salvage the final game in its three-game weekend series with Iowa, winning 7-5 on Saturday afternoon (April 24) at the Wilpon Baseball Complex, home of historic Ray Fisher Stadium.”
Another area where NLG has already become a dominant presence is financial reporting. Forbes and the Associated Press use NLG to quickly generate earnings and financial summaries. Once again, NLG allows these companies to significantly increase the amount of content they generate and to be extremely timely in their reporting.
One final interesting use of NLG is the L.A. Times Quakebot, which the paper uses to monitor geological data and quickly generate stories or alerts in response to significant seismic activity, such as the 4.7 magnitude quake that hit the L.A. area last year. This allows content to be generated almost in real time and in an easy-to-read format that could potentially save lives.
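To make the idea of an alert bot a little more concrete, here is a minimal sketch in Python. It is not how Quakebot actually works; the threshold value, the field names, and the wording of the alert are all assumptions of mine, chosen only to show a program watching a number and writing a sentence when that number is large enough.

```python
# Hypothetical cutoff: readings below this magnitude produce no alert.
ALERT_THRESHOLD = 3.0


def quake_alert(reading):
    """Return a one-sentence alert for a seismic reading, or None.

    `reading` is an assumed dictionary with 'magnitude',
    'distance_miles', and 'place' keys (illustrative names only).
    """
    if reading["magnitude"] < ALERT_THRESHOLD:
        return None  # Not significant enough to report.
    return (
        f"A magnitude {reading['magnitude']:.1f} earthquake was reported "
        f"{reading['distance_miles']} miles from {reading['place']}."
    )


# Made-up sample reading for illustration.
sample = {"magnitude": 4.7, "distance_miles": 5, "place": "Los Angeles"}
print(quake_alert(sample))
```

A real system would also need a live data feed and, most likely, a human editor to review the draft before it is published.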
“I’m Sorry, Dave, I’m Afraid I Can’t Do That”
So while we greet the opportunities that NLGs can provide with both excitement and fear, it is clear that in the near future content creators are not in danger of losing their jobs. NLG algorithms are only another aid or tool to lessen the content creation burden; they can’t replace a human for a variety of reasons.
One of the current shortcomings of NLGs is their reliance on data and templates. They can do what computers do best, which is analyzing data or numbers and then regurgitating those numbers according to patterns or templates that humans provide.
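For the curious, a template-driven generator can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the team names, the box score layout, and the verb choices are invented, and real NLG products are far more elaborate. The point is that a human wrote every phrase, and the program merely picks among them based on the numbers.

```python
def summarize_game(game):
    """Fill a human-written sentence template from a box score.

    `game` is an assumed dictionary with 'home', 'away', and 'day'
    keys; each team is a dictionary with 'name' and 'runs'.
    The sketch assumes the game did not end in a tie.
    """
    home, away = game["home"], game["away"]

    # Work out the winner and loser from the numbers alone.
    if home["runs"] > away["runs"]:
        winner, loser = home, away
    else:
        winner, loser = away, home

    # A person chose these verbs; the code only selects one by margin.
    margin = winner["runs"] - loser["runs"]
    if margin == 1:
        verb = "edged"
    elif margin >= 5:
        verb = "routed"
    else:
        verb = "beat"

    return (
        f"The {winner['name']} {verb} the {loser['name']} "
        f"{winner['runs']}-{loser['runs']} on {game['day']}."
    )


# Made-up Little League box score for illustration.
game = {
    "home": {"name": "Maple Street Tigers", "runs": 7},
    "away": {"name": "Elm Park Cubs", "runs": 6},
    "day": "Saturday",
}
print(summarize_game(game))
```

Change the score and the sentence changes with it, but the program never says anything its author did not already put into the template.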
This pattern recognition can go as far as to mimic certain writing styles and to even generate sonnets in the style of Shakespeare. But again, this is just inputs and patterns and the might of computing power being leveraged to quickly analyze myriad word combinations until it finds those that statistically fit the programmed patterns. Think of it like number crunching but with words—lots of words.
For now, NLG bots and algorithms also do not possess emotion, so they can’t be creative, nor can they develop a style or make design choices such as selecting appropriate clip art to accompany a blog post (I actually had to use this fact to help make my technical writer here at work feel better). Another analogy might be pre-emotion chip Data from Star Trek: The Next Generation. He could play most musical instruments with technical prowess and even emulate the playing style of certain artists, but it was all derivative. He couldn’t develop his own style based on his own experiences and personality.
When it comes down to the user, will they care if a machine created the resources they are finding value in? Depending on the type of content, absolutely not, especially when they need to solve a problem and get out. Do you want a computer writing posts like this (maybe you do) or carrying out journalism on natural disasters or human interest stories? I don’t think so, at least not until scientists can create their own emotion chip. Just as I have mentioned before about finding your voice and using that to set your content apart with your audience, the human touch is still needed to foster a user’s connection with your work. In a sea of content, generated by human and machine alike, it is not trite to remind all content creators that their uniqueness and their personality can’t be replaced by an algorithm.
In my next post we’ll explore the usages of NLG at our fellow agencies. Is your agency already using NLG—or do you see how it could? Share your thoughts in the comments section.
- Oh, and in our example above, a machine wrote the first summary, a human wrote the second, and in the interest of full disclosure, I didn’t guess correctly.
You’ve just finished reading the latest article from our Monday column, The Content Corner. This column focuses on helping solve the main content issues facing federal digital professionals, including producing enough content and making that content engaging.