[In-depth] The onrushing army of news-writing robots!

AsiaIndustrial NetNews: Last Friday, the news that Toutiao was valued at more than $12 billion flooded social media feeds. In the red ocean of the content market, Toutiao, whose valuation has skyrocketed, has become a “big fish” that intimidates even BAT. Among oligarchs and strong players there are many variables and possibilities, and a technical advantage in one area can shift the whole landscape.

The writing robot, a product of content-market competition combined with artificial intelligence technology, has become a new battleground for Baidu, Alibaba, Tencent, and Toutiao, and the gradually emerging machine-writing industry has itself become a particularly important part of that battle.

During my in-depth investigation of the “machine writing” industry, two facts I had not anticipated led me to re-examine artificial intelligence and content production. These two facts are:

1) In vertical-field reporting, writing robots are already used with high frequency;

2) Tencent, Toutiao, Alibaba, and Baidu are the earliest and most mature platforms in China applying writing-robot technology.

Understanding how artificial intelligence is being applied in professional fields gives a more direct sense of the prelude to the approaching technological revolution. Its deeper significance for the content field is the disruptive impact it will have on the internet’s upstream information production process and on the way data is used.

At present, this key valve is in the hands of the three BAT giants and a new rival valued at more than 12 billion US dollars. To this end, Zhidongte interviewed Liu Kang, head of Tencent’s content robot project and deputy director of Tencent Finance; Dr. Tang Kaizhi, Alibaba’s big data value mining expert and chief data scientist at Yicai; experts associated with Toutiao; and industry veterans, in order to deconstruct this battle over the gateway to internet content and the reorganization of data, and to glimpse the larger market for formulaic text generation behind it.

(Note: Baidu has reportedly launched an intelligent writing robot, Writing-bots, but the author’s investigation found no application case that could be verified, so it is not discussed in this article for now. The event commentary function of “Dumi” is taken as the reference instead.)

1. The robot behind the text

A writing robot is not a physical robot; the term is an abstraction and anthropomorphism of a system that automatically generates text and produces content. Specifically, most writing robots start from a specific information database, process that information through steps such as screening, analysis, and calculation, recombine and arrange it, apply a pre-set writing template, and finally output a news report.
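The pipeline described above (data source, screening, calculation, template, output) can be illustrated with a minimal Python sketch. It does not represent how Dreamwriter, Xiaomingbot, or DT Draft King is actually built; the `StockQuote` record, the 2% threshold, the company names, and the template wording are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class StockQuote:
    """One row of a hypothetical market-data table (the 'information database')."""
    name: str
    prev_close: float
    close: float


# Pre-set writing templates (hypothetical wording), chosen by direction of the move.
TEMPLATES = {
    "up": Template("$name rose $pct% to close at $close."),
    "down": Template("$name fell $pct% to close at $close."),
}


def percent_change(quote: StockQuote) -> float:
    """Calculation step: percentage move from the previous close."""
    return (quote.close - quote.prev_close) / quote.prev_close * 100


def write_briefs(quotes: list[StockQuote], threshold: float = 2.0) -> list[str]:
    """Screen the quotes, then fill a template for each one worth reporting."""
    briefs = []
    for quote in quotes:
        change = percent_change(quote)
        # Screening step: ignore moves smaller than the (assumed) threshold.
        if abs(change) < threshold:
            continue
        key = "up" if change > 0 else "down"
        # Template step: substitute the computed values into the fixed wording.
        briefs.append(
            TEMPLATES[key].substitute(
                name=quote.name,
                pct=f"{abs(change):.1f}",
                close=f"{quote.close:.2f}",
            )
        )
    return briefs


if __name__ == "__main__":
    sample = [
        StockQuote("Alpha Corp", 10.0, 10.5),
        StockQuote("Beta Ltd", 20.0, 19.9),
        StockQuote("Gamma Inc", 50.0, 47.5),
    ]
    for line in write_briefs(sample):
        print(line)
```

Each stage is a separate, small function, so the screening rule or the template wording can be changed without touching the rest, which is roughly what makes this kind of system cheap to run at scale.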

“Machine writing” involves a number of artificial intelligence technologies, including data mining, natural language processing, machine learning, search technology, and knowledge graphs. In terms of the usual elements of artificial intelligence, the “specific information database” is the “big data” that supports the technology, and the “recombination and arrangement of text information” is the core algorithm behind the product. From the early manual setting of templates to machine self-learning and template optimization after the introduction of deep learning, the “writing robot” itself is constantly evolving.

Robots first entered news writing at the long-established American newspaper The Washington Post. As early as the end of 2012, The Washington Post launched a real-time news verification project called “Truth Teller”. It records the text, audio, and other information in news reports throughout the process, compares them against a verification (“anti-counterfeiting”) database, and raises an alert as soon as an anomaly is found.

Beginning in 2015, the “writing robots” of Chinese and foreign media stepped onto the stage and began to make names for themselves. Six top international media outlets set up their own robot systems: The New York Times’ Blossom, The Washington Post’s Truth Teller, the Los Angeles Times’ Smart Inline Template, The Guardian’s Open001, Reuters’ Open Calais, and the Associated Press’s Wordsmith.

In China, Tencent took the lead by launching the Dreamwriter writing robot in August 2015. Over the following year, Toutiao’s Xiaomingbot, Yicai’s DT Draft King, and Baidu’s Dumi commentary surfaced one after another. The four-way standoff among Tencent, Alibaba, Baidu, and Toutiao has officially taken shape.

In the information-feed market, Yicai can be taken to represent Alibaba’s strategic layout. In 2015, Alibaba purchased a 30% stake in China Business Finance Group for 1.2 billion yuan, and then transferred its big data value mining expert, Dr. Tang Kaizhi, to Yicai as chief data scientist to provide technical support for its automatic and assisted writing products.

2. The writing-robot contest among the four major platforms

1) Product features and application status

To present the writing-robot market more concretely, the author examined how four products, from Tencent, Toutiao, Baidu, and Yicai, perform in terms of user interface, content presentation, number of manuscripts, and efficiency, and compared each company’s thinking and characteristics in product layout and application fields.

Coverage is concentrated on sports events and financial news, and most of the output consists of short, quickly produced event reports and news briefs. Among the four, Tencent covers the widest range of news content; the platforms its product feeds include Tencent Finance, Tencent Technology, and Tencent Sports. Thanks to its stronger media attributes and extensive product channels, CBN Group distributes content more widely, including media products, WeChat content pushes, and TV news.

As the comparison above shows, all four “writing robot” products are still at the stage of being developed in-house for in-house use, mainly for content production and distribution on the companies’ own media platforms. In addition, Tang Kaizhi, chief data scientist of Yicai, told Zhixi that its DT Draft King product is now also used in the information section of the e-commerce platform “Qianniu”.

In terms of reporting form, event reports tend to combine pictures and text, with the pictures matched automatically. The reporting style may be adjusted for different terminals and products. For example, on the Tencent Sports mobile client, event reports are presented as plain text, while the Tencent News client retains the complete picture-and-text content.

As for production volume, none of the companies gave specific figures, so the author compiled statistics from the number of pages displayed and other reference values. Because its coverage is the widest, Tencent has a certain advantage in the number of usable manuscripts. “The mechanism of machine writing is to write on a large scale, and then hand over to manual editing and the CMS (content management system),” said Liu Kang, head of Tencent’s content robot project and deputy director of Tencent Finance.

Yicai.com mainly uses stock market movements as its entry point for news, and reports more frequently. “Long stories are used relatively infrequently, at most one per day, or one per month.”

It should be noted that Baidu’s Dumi platform, which is used only for “event commentary”, is included in the analysis mainly because Dumi’s “real-time picture-and-text presentation plus audio broadcast” has reached the same degree of industrialization as machine writing, and the technical principles behind it are very similar. Perhaps because the Baidu platform itself lacks media attributes, or because self-produced content does not suit a content distribution platform, Baidu does not present this output as news on the front end.

2) Analysis of user interface features

The screens above show, respectively, Yicai’s “7*24-hour Kanban”, the “Dumi Live Basketball Game” in the Dumi app, Toutiao’s “Xiaomingbot” author column, and a Tencent News search for “Dreamwriter”, for a user-side comparison.

In terms of report form and content richness, machine-written pieces are no different from human-written ones. In headline handling, phrases such as “Beat the Bucks”, “New High”, “Wizards”, and “Regret” move away from a mechanical listing of scores and give the headlines the personalized style of media reporting.

Rolling stock-market news briefs emphasize timeliness and data accuracy. On this dimension, machines outperform humans.

3) Analysis of content presentation characteristics

In report form and style, each company tailors its presentation to its users. Tencent’s Dreamwriter and Toutiao’s Xiaomingbot both report with a combination of pictures and text. Xiaomingbot is richer in pictures and in its sense of presence, while Dreamwriter emphasizes scores and details. In structuring an article, both distill the highlights and overall course of the event rather than simply listing the data.

Yicai, as a more professional and vertical media platform, presents its content in English as well as Chinese. “A large part of the Chinese financial templates are translated into English templates, which saves the cost of content translation and allows global use at Yicai,” Dr. Tang Kaizhi explained.

Baidu Dumi’s commentary is presented in the form of a dialog box, which gives users a real-time live broadcast of the game, accompanied by some animations and audio.
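The template translation Dr. Tang describes can be pictured as keeping one set of data and one template per language, so that no finished article ever needs translating. The sketch below is only an illustration under that assumption; the template wording, the `render` function, and the sample company names are hypothetical and are not taken from Yicai’s system.

```python
# Hypothetical parallel templates: the English one is a translation of the
# Chinese one, and both expose the same placeholders.
TEMPLATES = {
    "zh": "{name}收于{close:.2f}元，{direction}{pct:.1f}%。",
    "en": "{name} closed at {close:.2f} yuan, {direction} {pct:.1f}%.",
}

# Direction words are localized separately so the data itself stays language-neutral.
DIRECTION_WORDS = {
    "zh": {"up": "上涨", "down": "下跌"},
    "en": {"up": "up", "down": "down"},
}


def render(lang: str, name: str, prev_close: float, close: float) -> str:
    """Fill the template for `lang` with figures computed from the same data."""
    pct = (close - prev_close) / prev_close * 100
    direction = "up" if pct >= 0 else "down"
    return TEMPLATES[lang].format(
        name=name,
        close=close,
        direction=DIRECTION_WORDS[lang][direction],
        pct=abs(pct),
    )


if __name__ == "__main__":
    print(render("zh", "阿尔法公司", 10.0, 10.5))
    print(render("en", "Alpha Corp", 10.0, 10.5))
```

Translating a template is a one-time cost, whereas translating every generated article would recur with each story, which is consistent with the saving mentioned in the quote.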

In general, in the new business of writing robots, the technology and products of the three BAT companies and Toutiao are not far apart. It is a pity that Baidu has not integrated the underlying technology into a complete media product, and many people do not know about Dumi’s “live event broadcast” function.

The reason may be that Baidu itself lacks the genes of a media platform and focuses on users’ passive search and on its advertising business. On the other hand, the “Baiduization” of Toutiao is proceeding faster than expected.

3. Laymen watch the spectacle; insiders see the substance

For someone who once treated the “artificial intelligence threat theory” as a joke, suddenly finding that robots have moved into the field he is good at was somewhat surprising and unsettling. The greater value of artificial intelligence, however, is that humans can put it to use once they understand it. “I personally prefer a neutral view. Machine writing can indeed replace part of the workforce, but only the redundant, low-skill part,” said Liu Kang, deputy director of Tencent Finance.

In conversation, Dr. Tang Kaizhi of Yicai said that research on machine writing mainly revolves around three typical modes, whose “logic runs from shallow to deep, from precise to vague”:

1) The first category is articles that state facts based on numbers and conduct simple logical analysis, such as monitoring of the secondary market and newsletters of sports events;

2) The second category is to extract targeted information from the information