AIMultiple Research

Generative Adversarial Networks (GAN) & Synthetic Data [2024]

A generative adversarial network (GAN) is a type of generative model based on deep neural networks. You may have heard of it as the algorithm behind the artificially created portrait Edmond de Belamy, which sold for $432,500 in 2018. Apart from their artistic capabilities, GANs are powerful tools for generating artificial datasets that can be hard to distinguish from real ones.

How are GANs used in creating synthetic data?

Like any other generative model, GANs aim to learn the distribution of a training dataset so that they can generate new (synthetic) data instances.

A GAN model is made up of two sub-models:

  • The generator produces new data instances from random input (noise).
  • The discriminator is trained on both real data and fake data (from the generator). It then evaluates whether a given input is real or fake.

These two sub-models work against each other: the discriminator learns to better distinguish generated data from real data, while the generator learns to produce more realistic data points. Training continues until the generator creates data instances that the discriminator cannot reliably distinguish from real data.

[Figure omitted; source: MathWorks]
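
The adversarial loop described above can be sketched in plain Python. This is a toy illustration under strong assumptions, not a production GAN: the "real" data is a one-dimensional Gaussian with mean 4, the generator is a linear map G(z) = a*z + b, the discriminator is a logistic classifier D(x) = sigmoid(w*x + c), and the gradients are derived by hand. The function name train_toy_gan and all parameter values are hypothetical choices for this example; real GANs use deep networks and a framework such as PyTorch or TensorFlow.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # assumed "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # random generator input
        fake = [a * z + b for z in noise]

        # Discriminator step: minimize -[log D(real) + log(1 - D(fake))].
        gw = gc = 0.0
        for xr, xf in zip(real, fake):
            dr, df = sigmoid(w * xr + c), sigmoid(w * xf + c)
            gw += (dr - 1.0) * xr + df * xf
            gc += (dr - 1.0) + df
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Generator step: minimize -log D(G(z)), the non-saturating loss.
        ga = gb = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            ga += (d - 1.0) * w * z
            gb += (d - 1.0) * w
        a -= lr * ga / batch
        b -= lr * gb / batch
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print("generator:", a, b, "discriminator:", w, c)
```

Because the discriminator in this sketch is linear, it can only react to a difference in means, so the generator's offset b is pushed toward the real mean while the scale a is not meaningfully constrained. Setups this simple may also oscillate rather than settle, which is one reason practical GAN training relies on richer models and additional stabilization techniques.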

What types of synthetic data can be generated with GANs?

GANs are pretty versatile in terms of the data type they can work with:

  • Images: Realistic images of faces, objects, handwriting, etc. Ian Goodfellow, the inventor of GANs, has shared on Twitter how their face-generating capabilities improved over time.
  • Videos
  • Audio
  • Tabular data
  • Time Series: Synthesizing convincing time-series data is challenging because the model should generate data points that depend on many other past data points.

What are alternatives to GANs for creating synthetic data?

GANs are not the only generative model based on deep learning. Other types of generative models include:

  • Variational autoencoder: A variational autoencoder is an unsupervised deep learning model that encodes input data into the parameters of a probability distribution over latent attributes. It then samples from that distribution and decodes the samples to create new data.
  • Deep autoregressive models: Deep autoregressive models are generative models for sequential data that are trained to predict the next value from past values. They generate new data points by feeding previously generated values back in as model inputs.
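
The autoregressive idea of feeding past values back in as inputs can be illustrated with a much simpler stand-in. The sketch below uses a linear AR(1) model, x[t] = phi * x[t-1] + noise, rather than a deep network, and assumes a zero-mean series; the function names fit_ar1 and sample_ar1 are hypothetical and chosen only for this example.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x[t] = phi * x[t-1] + noise (zero-mean series assumed)."""
    prev, curr = series[:-1], series[1:]
    phi = sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)
    residuals = [c - phi * p for p, c in zip(prev, curr)]
    sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return phi, sigma


def sample_ar1(phi, sigma, start, n, seed=0):
    """Generate n points; each new point depends on the previous generated one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n - 1):
        out.append(phi * out[-1] + rng.gauss(0.0, sigma))
    return out


if __name__ == "__main__":
    phi, sigma = fit_ar1([1.0, 0.5, 0.25, 0.125])
    print(sample_ar1(phi, sigma, start=8.0, n=4))
```

A deep autoregressive model replaces the single coefficient phi with a neural network that can condition on many past values, but the sampling loop has the same shape: predict, append, repeat.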

There are also methods for generating synthetic data that are not based on deep learning, such as the Monte Carlo method. The choice of method depends on the context and your needs. For more information on various methods, check our comprehensive guide on synthetic data.
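
As a rough illustration of a Monte Carlo style approach, the sketch below fits a mean and standard deviation to each column of a small numeric table and then draws random synthetic rows from those fitted distributions. It assumes every column is Gaussian and independent of the others, so correlations between columns are lost; the function names fit_gaussian_columns and sample_rows are hypothetical.

```python
import random
import statistics


def fit_gaussian_columns(rows):
    """Estimate (mean, standard deviation) for each column of a numeric table."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample_rows(params, n, seed=0):
    """Draw n synthetic rows, sampling each column independently."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sd) for mu, sd in params] for _ in range(n)]


if __name__ == "__main__":
    real_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    params = fit_gaussian_columns(real_rows)
    for row in sample_rows(params, 3):
        print(row)
```

This kind of sampling is easy to reason about, but it only reproduces the statistics you explicitly model, which is where learned generative models such as GANs can offer an advantage.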

When you are ready to work with synthetic data, you can find a prioritized, data-driven list of synthetic data technology companies.

Cem Dilmegani
Principal Analyst

Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 60% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE, NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and media that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised businesses on their enterprise software, automation, cloud, AI / ML and other technology related decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He has also led commercial growth of deep tech company Hypatos that reached a 7 digit annual recurring revenue and a 9 digit valuation from 0 within 2 years. Cem's work in Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.
