AIMultiple Research

Natural Language Generation (NLG) in 2024

Cem Dilmegani
Updated on Jan 11

Artificial intelligence is disrupting industries with various use cases, and content automation is one of those applications. Natural language generation (NLG) is the AI technology behind text content automation with its capability to convert data into words, sentences, articles and even film scripts.

In this article, we highlight the important aspects of NLG: why it matters, how it works, its challenges, and its applications.

What is Natural Language Generation?

Natural Language Generation (NLG), a subcategory of Natural Language Processing (NLP), is a software process that automatically transforms structured data into human-readable text. 

Using NLG, businesses can generate thousands of pages of data-driven narratives in minutes, provided the right data is available in the right format. NLG is a subcategory of content automation focused on text automation.

Why is Natural Language Generation important?

Around 35% of customers read blogs and websites before deciding which products to buy. For many e-commerce and retail firms, it is difficult to generate content for each product manually. NLG technology can automate this process, improving the companies' overall marketing and sales efforts.

NLG market also has potential because:

How does NLG work?

An automated text generation process involves 6 stages. For the sake of simplicity, we’ll explain each stage using the example of a robot journalist's news report on a football match:

1. Content Determination

The limits of the content should be determined. The data often contains more information than necessary. In the football news example, content about goals, cards, and penalties is what matters to readers.
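
As a minimal sketch of content determination, assume the match feed arrives as a list of event dictionaries. The field names and the set of "relevant" event types below are hypothetical choices for illustration, not taken from any specific NLG product. Under those assumptions, the step can be as simple as a filter:

```python
# Hypothetical raw match feed: more events than a short report needs.
RAW_EVENTS = [
    {"minute": 12, "type": "goal", "team": "X", "player": "Player A"},
    {"minute": 20, "type": "throw_in", "team": "Y", "player": "Player B"},
    {"minute": 44, "type": "yellow_card", "team": "Y", "player": "Player C"},
    {"minute": 51, "type": "corner", "team": "X", "player": "Player D"},
    {"minute": 78, "type": "penalty", "team": "Y", "player": "Player E"},
]

# Assumed editorial choice: only these event types matter to readers.
RELEVANT_TYPES = {"goal", "yellow_card", "red_card", "penalty"}


def determine_content(events, relevant_types=RELEVANT_TYPES):
    """Keep only the events worth reporting, in chronological order."""
    selected = [e for e in events if e["type"] in relevant_types]
    return sorted(selected, key=lambda e: e["minute"])


selected_events = determine_content(RAW_EVENTS)
print([e["type"] for e in selected_events])
```

Here the throw-in and the corner are dropped, and only the goal, the card, and the penalty move on to the next stage.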

2. Data interpretation

The analyzed data is interpreted. Thanks to machine learning techniques, patterns can be recognized in the processed data. This is where data is put into context. For instance, information such as the winner of the match, goal scorers & assists, and minutes when goals are scored are identified in this stage.
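
The sketch below illustrates the kind of facts this stage produces. It uses plain counting rules rather than machine learning, so it is a simplification of what the paragraph above describes. The goal records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical goal events selected in the previous stage.
GOALS = [
    {"minute": 12, "team": "X", "scorer": "Player A", "assist": "Player B"},
    {"minute": 67, "team": "Y", "scorer": "Player C", "assist": None},
    {"minute": 88, "team": "X", "scorer": "Player A", "assist": "Player D"},
]


def interpret_match(goals, home="X", away="Y"):
    """Derive the score, the winner, scorers, and assists from goal events."""
    score = Counter(g["team"] for g in goals)
    home_goals, away_goals = score[home], score[away]
    if home_goals > away_goals:
        winner = home
    elif away_goals > home_goals:
        winner = away
    else:
        winner = None  # a draw has no winner
    return {
        "score": (home_goals, away_goals),
        "winner": winner,
        "scorers": [(g["scorer"], g["minute"]) for g in goals],
        "assists": [g["assist"] for g in goals if g["assist"]],
    }


facts = interpret_match(GOALS)
print(facts["winner"], facts["score"])
```

A production system would likely add learned components here, for example to judge how surprising a result is, but the output would still be a set of facts like the dictionary above.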

3. Document planning

In this stage, the structures in the data are organized to create a narrative structure and document plan. 

Football news generally starts with a paragraph that indicates the score of the game with a comment that describes the level of intensity and competitiveness in the game, then the writer reminds the pre-game standings of teams, describes other highlights of the game in the next paragraphs, and ends with player and coach interviews.
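
One way to picture a document plan is as an ordered list of sections, where a section is written only if there is data for it. The section names below are hypothetical labels for the parts of a football report described above; this is a sketch, not how any particular tool represents its plans.

```python
# Narrative order of a typical football report, as described above.
SECTION_ORDER = [
    "score_summary",
    "pre_game_standings",
    "highlights",
    "interviews",
]


def plan_document(available):
    """Return the sections to write, in narrative order, skipping empty ones."""
    return [name for name in SECTION_ORDER if available.get(name)]


# Hypothetical data for one match: no interviews were collected.
available_data = {
    "highlights": ["penalty overruled by VAR"],
    "score_summary": {"home": 2, "away": 1},
    "interviews": [],
    "pre_game_standings": {"X": 3, "Y": 7},
}

print(plan_document(available_data))
```

The order of the keys in the input does not matter: the plan always follows the narrative order, and the empty interviews section is left out.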

4. Sentence Aggregation

This stage is also called micro planning. It is about choosing the expressions and words in each sentence for the end-user. In other words, this is where separate sentences are aggregated because they are related in context.

For example, the first two sentences below convey different information. However, if the second event occurs right before half time, then these two sentences can be aggregated into the third sentence:

  1. “[X team] maintained their lead into halftime.”
  2. “VAR overruled a decision to award [Y team]’s [Football player Z] a penalty after replay showed [Football player T]’s apparent kick didn’t connect.”
  3. “[X team] maintained their lead into halftime after VAR overruled a decision to award [Y team]’s [Football player Z] a penalty after replay showed [Football player T]’s apparent kick didn’t connect.”
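
The aggregation rule in this example can be sketched as a small function. The 45-minute halftime mark, the 5-minute window, and the function name are assumptions made for illustration. The sketch also assumes the second sentence starts with a proper noun or acronym (such as "VAR"), so its capitalization does not need to change when it moves mid-sentence.

```python
def aggregate(lead_sentence, event_sentence, event_minute, halftime=45, window=5):
    """Join two sentences with "after" if the event happened just before halftime.

    Otherwise the two sentences are kept separate.
    """
    if halftime - window <= event_minute <= halftime:
        # Drop the final period of the first sentence before joining.
        first = lead_sentence.rstrip(" .")
        return f"{first} after {event_sentence}"
    return f"{lead_sentence} {event_sentence}"


lead = "Team X maintained their lead into halftime."
event = "VAR overruled a decision to award Team Y a penalty."

print(aggregate(lead, event, event_minute=44))
print(aggregate(lead, event, event_minute=10))
```

With the event at minute 44 the two sentences are merged into one; at minute 10 they stay separate because the event is not close enough to halftime to explain the lead.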

5. Grammaticalization

The grammaticalization stage makes sure that the whole report follows the correct grammatical form, spelling, and punctuation. This includes validation of actual text according to the rules of syntax, morphology, and orthography. For instance, football games are written in the past tense.
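
A very small sketch of this stage is shown below. It enforces three rules from the paragraph above: past tense, a capitalized first letter, and closing punctuation. The verb table is a hypothetical, hand-written lookup, and the fallback of appending "ed" is deliberately naive; real systems rely on much fuller morphology resources.

```python
# Hypothetical lookup of past-tense forms, including irregular verbs.
PAST_TENSE = {
    "score": "scored",
    "win": "won",
    "beat": "beat",
    "receive": "received",
}


def realize(subject, verb, obj):
    """Build a past-tense sentence with basic capitalization and punctuation."""
    past = PAST_TENSE.get(verb, verb + "ed")  # naive fallback for unknown verbs
    sentence = f"{subject} {past} {obj}".strip()
    sentence = sentence[0].upper() + sentence[1:]
    if not sentence.endswith((".", "!", "?")):
        sentence += "."
    return sentence


print(realize("player A", "score", "in the 12th minute"))
print(realize("team X", "win", "2-1"))
```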

6. Language Implementation

This stage involves inputting data into templates and ensuring that the document is output in the right format and according to the preferences of the user. 
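
As a sketch of this final stage, the example below fills a template with match facts using Python's standard `string.Template` and then renders the result in the format the user asked for. The template wording, the field names, and the two output formats are assumptions for illustration.

```python
import html
from string import Template

# Hypothetical one-sentence report template.
REPORT_TEMPLATE = Template("$winner beat $loser $home_goals-$away_goals on $day.")


def render(facts, fmt="text"):
    """Fill the template and return it as plain text or as an HTML paragraph."""
    sentence = REPORT_TEMPLATE.substitute(facts)
    if fmt == "html":
        return f"<p>{html.escape(sentence)}</p>"
    return sentence


match_facts = {
    "winner": "Team X",
    "loser": "Team Y",
    "home_goals": 2,
    "away_goals": 1,
    "day": "Saturday",
}

print(render(match_facts))
print(render(match_facts, fmt="html"))
```

`Template.substitute` raises a `KeyError` when a field is missing, which is useful here: an incomplete record fails loudly instead of producing a sentence with a hole in it.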

Top 7 use cases of Natural Language Generation

Since NLG aims to make sense of the data and create human-readable insights, it can be applied to all areas dealing with reporting, content creation, and content personalization.

1. Retail & Wholesale

NLG solutions can provide product descriptions and categorization for online shopping and e-commerce and help personalize customer communication via chatbots. Steven Morell, CRO of AX Semantics, explains how an e-commerce site can automate its product description writing process with AX Semantics’ NLG tool.
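
To make the idea concrete, here is a minimal template-based sketch that turns structured product records into descriptions. It is not how AX Semantics or any other vendor works; the product records and field names are invented for illustration.

```python
# Hypothetical product catalog rows.
PRODUCTS = [
    {"name": "Trail Runner 2", "category": "running shoe", "color": "blue", "weight_g": 240},
    {"name": "City Pack", "category": "backpack", "color": "black", "weight_g": None},
]


def describe(product):
    """Turn one structured product record into a short description."""
    # Pick "a" or "an" based on the first letter of the color.
    article = "an" if product["color"][0].lower() in "aeiou" else "a"
    text = f"The {product['name']} is {article} {product['color']} {product['category']}"
    # Optional attributes are mentioned only when the data is present.
    if product.get("weight_g"):
        text += f" that weighs {product['weight_g']} g"
    return text + "."


for product in PRODUCTS:
    print(describe(product))
```

The same function can run over thousands of catalog rows, which is where the time savings over manual writing come from.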

2. Banking & Finance

The banking industry relies heavily on data and insights for performance reporting. Profit and loss reports, for example, can be automated using NLG systems. NLG techniques can also support fintech chatbots that give customers personal financial management advice.

3. Manufacturing

As IoT applications are implemented more widely in production sites, they generate a significant volume of data useful for performance improvement and maintenance. NLG can automate the communication of important findings such as IoT device status and maintenance reporting so employees can take action faster.
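
A simple way to picture this is a function that checks sensor readings against thresholds and writes a status sentence for each device. The device names, fields, and threshold values below are hypothetical.

```python
# Hypothetical sensor readings from two machines.
READINGS = [
    {"device": "press-01", "temperature_c": 71.5, "hours_since_service": 120},
    {"device": "press-02", "temperature_c": 93.0, "hours_since_service": 510},
]

MAX_TEMP_C = 85.0         # assumed alert threshold
SERVICE_INTERVAL_H = 500  # assumed maintenance interval in hours


def status_message(reading):
    """Describe a device's status in one sentence, listing any issues found."""
    issues = []
    if reading["temperature_c"] > MAX_TEMP_C:
        issues.append(
            f"temperature is {reading['temperature_c']:.1f} C (limit {MAX_TEMP_C:.1f} C)"
        )
    if reading["hours_since_service"] >= SERVICE_INTERVAL_H:
        issues.append("maintenance is due")
    if not issues:
        return f"{reading['device']} is operating normally."
    return f"{reading['device']} needs attention: " + " and ".join(issues) + "."


for reading in READINGS:
    print(status_message(reading))
```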

4. Media

NLG solutions can aid summarization and content creation. Sports and financial news in particular tend to follow similar templates, so text explaining such events can be created easily. Systems that write these stories are also called robot journalists.

For more information on robot journalists and other AI applications in media, feel free to check our related article.

5. Insurance

NLG solutions can help to improve the communication of personalized plans for customers.

6. Transportation

Chatbots can deliver alerts about delays and schedules. NLG tools can be used to create personalized, easy-to-read travel plans.

7. Politics

Probably the most dangerous use case is using NLG solutions to spread personalized propaganda and misinformation. Unfortunately, this risks making the current flow of political disinformation even more dangerous and personalized.

What are real-world content automation examples thanks to NLG?

Here are some real-world content automation examples using NLG:

  • GPT-3 is a language model developed by OpenAI. Here is an article on “Robots Come in Peace”, which was written by GPT-3. Though GPT-3 creates well-written narratives, it lacks logical understanding, which makes its articles prone to error.
  • LaMDA is Google’s language model for dialogue applications, announced in mid-2021. It was trained on large volumes of data and was introduced to the public as an AI that pretended to be Pluto and a paper airplane.
  • Wu-Dao is China’s “improved” version of GPT-3 trained on 4.9 terabytes of high-quality images and texts in both English and Chinese. It is capable of generating both text and images and was introduced to crowds in the form of a virtual student capable of writing poems, drawing, and composing music.
  • In 2019, Springer published its first machine-generated book.
  • Gmail’s Smart Compose provides recommendations on what should be typed next in an email. It also learns from your selections to enhance the recommendation algorithm for upcoming emails.
  • The paraphrasing tool QuillBot uses NLG.
  • All conversational AI/ chatbot applications are also examples of NLG.
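
To give a feel for next-word recommendation, as in the Smart Compose example above, here is a toy bigram model: it counts which word most often follows each word in a small set of sentences and suggests that word. This is only an illustration of the general idea. It is not how Gmail's Smart Compose is implemented, and the training sentences are invented.

```python
from collections import Counter, defaultdict

# Hypothetical past emails used as training text.
CORPUS = [
    "thank you for your email",
    "thank you for your help",
    "thank you for the update",
    "see you on monday",
]


def train_bigrams(sentences):
    """Count which word follows each word in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(model, word):
    """Return the most frequent next word, or None if the word is unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


model = train_bigrams(CORPUS)
print(suggest(model, "thank"))
print(suggest(model, "for"))
```

Learning from a user's selections, as described above, would correspond to adding each accepted sentence to the counts so that frequent choices rank higher over time.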

News

  • The Associated Press uses NLG to create corporate earnings reports automatically.
  • The Washington Post is using their in-house automated storytelling technology, called Heliograf, to cover all Washington, D.C.-area high school football games every week.
  • A showcase website covers all football and ice hockey games in Sweden. Every article about every game, from kids’ games to the top leagues, is written by Lingmill’s text robot.

What are the challenges of content automation with NLG?

1. Data availability and quality

Automated content requires high-quality structured data. Content automation therefore fits well in areas such as finance, sports, or weather, where data providers make sure that the data is accurate and reliable.
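
In practice, this usually means checking each record before any text is generated from it. The sketch below shows one possible check for a match record; the required fields and rules are assumptions chosen for the football example used earlier in this article.

```python
# Hypothetical schema: field name -> expected type.
REQUIRED_FIELDS = {
    "home_team": str,
    "away_team": str,
    "home_goals": int,
    "away_goals": int,
}


def validate_record(record):
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing {field}")
        elif not isinstance(value, expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
        elif expected_type is int and value < 0:
            problems.append(f"{field} cannot be negative")
    return problems


good = {"home_team": "X", "away_team": "Y", "home_goals": 2, "away_goals": 1}
bad = {"home_team": "X", "home_goals": "2", "away_goals": -1}

print(validate_record(good))
print(validate_record(bad))
```

Records with problems can be held back for review, so that a missing or malformed value never ends up in a published sentence.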

2. Originality & Writing quality

Natural language generation is limited to providing answers to prewritten questions by analyzing the given data. Algorithms cannot ask new questions, detect needs, recognize threats, solve problems, or give their thoughts and interpretation on topics such as social and policy change.

Thanks to machine learning and data augmentation techniques, the quality of NLG content is likely to keep improving. However, auto-generated articles tend to be less original than human-written ones.

3. Bias

NLG algorithms rely on data and assumptions. AI bias can create prejudiced algorithms and outcomes.

Feel free to check our article if you want to learn more about biases in AI algorithms, including types, examples, best practices & leading tools to reduce bias.

If you have questions on Natural Language Generation vendors, feel free to check our sortable, regularly updated list of NLG companies.

