AIMultiple Research

Understanding Data Mesh: The key features & principles in '24

Gulbahar Karatas
Updated on Jan 2

Data environments and data sources are constantly changing. New data is generated in every business activity and there is a need to:

  • analyze new data generated in the different domains and regulate access to each data product
  • apply organizational policies to data products in these domains
  • make this data consistent.

Meeting these needs means adopting data-oriented approaches, analyzing data networks, and using them in operational activities. Data lakes and data warehouses both store large amounts of data. However, they can fall short of the agility modern organizations require, since each has limitations in the types of data structures it handles well.

What is data mesh?

Data mesh is a decentralized data architecture approach introduced by Zhamak Dehghani of ThoughtWorks. It aims to make data highly available, easily discoverable, and secure for end-users. It manages and supports access to data before it is transferred to a data lake or data warehouse, connects data that is spread across different physical locations, and allows end-users to manage data stored in those locations. Other advantages of a decentralized structure include:

  • It can be more secure than a centralized database.
  • It can be easily accessed from different networks.
Source: Zhamak Dehghani

What are the main principles of data mesh architecture?

  1. Data as a product: This principle addresses data quality and data silos. Silos leave data that is held by one group not fully accessible to others in the same organization.
  2. Self-serve data platform: It provides:
    • data products schema
    • data catalog
    • storage
    • a data pipeline that processes data in sequence: the output of one element is the input of the next.
  3. Domain-oriented decentralized data ownership and architecture: Giving ownership of data to a central data team is not always the ideal solution. Instead of a data warehouse / lake team owning the data, ownership is given to those who know the domain best.
  4. Federated computational governance: A set of rules and a decision model that can be applied to independent data products to ensure their interoperability.
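
The four principles above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the `DataProduct` class, the `catalog` dictionary, the `POLICIES` rules, and the sample "orders" product are all hypothetical names invented for this example. It shows a domain-owned product with a schema, a pipeline in which each step feeds the next, a shared catalog, and global rules checked against every product.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Callable

Rows = list[dict]
Step = Callable[[Rows], Rows]


@dataclass
class DataProduct:
    """A domain-owned data product (hypothetical structure)."""
    name: str
    domain: str
    owner: str                    # someone inside the domain, not a central team
    schema: dict[str, type]
    steps: list[Step] = field(default_factory=list)

    def run(self, rows: Rows) -> Rows:
        # Sequential pipeline: the output of one step is the input of the next.
        return reduce(lambda data, step: step(data), self.steps, rows)


# Self-serve platform piece: a catalog where domains register their products.
catalog: dict[str, DataProduct] = {}


def register(product: DataProduct) -> None:
    catalog[product.name] = product


# Federated governance: global rules that every independent product must pass.
def has_owner(p: DataProduct) -> bool:
    return bool(p.owner)


def has_schema(p: DataProduct) -> bool:
    return len(p.schema) > 0


POLICIES = {"has_owner": has_owner, "has_schema": has_schema}


def violations(p: DataProduct) -> list[str]:
    """Return the names of the global rules this product breaks."""
    return [name for name, rule in POLICIES.items() if not rule(p)]


# Two example pipeline steps owned by the (hypothetical) sales domain.
def drop_missing_amount(rows: Rows) -> Rows:
    return [r for r in rows if r.get("amount") is not None]


def add_amount_cents(rows: Rows) -> Rows:
    return [{**r, "amount_cents": round(r["amount"] * 100)} for r in rows]


orders = DataProduct(
    name="orders",
    domain="sales",
    owner="sales-team",
    schema={"order_id": int, "amount": float},
    steps=[drop_missing_amount, add_amount_cents],
)
register(orders)

result = orders.run([
    {"order_id": 1, "amount": 9.5},
    {"order_id": 2, "amount": None},
])
```

Keeping the rules in one shared `POLICIES` mapping while each domain defines its own steps is one way to picture "federated" governance: the checks are global, but the products stay independent.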

What are the key features of the data mesh architecture?

  • Data is organized into domains, and each domain's data can be accessed by anyone who has been granted access to it.
  • Decentralized ownership and data management.
  • Data as a product, not as a by-product.
  • Each domain is responsible for its data, its data quality, and its security.
  • Domains don’t influence each other; each domain has its own resources.
  • Domains are owned by those who know the data best, which creates decentralized data ownership.
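
As a rough sketch of the features listed above, the following Python models domains that each keep their own records, access list, and quality rule. The `Domain` class, its method names, and the users "alice", "bob", and "carol" are assumptions made up for illustration; a real data mesh would rely on the platform's own access control and validation tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A domain that owns its data, its access list, and its quality rule."""
    name: str
    owner: str
    records: list[dict] = field(default_factory=list)
    readers: set[str] = field(default_factory=set)

    def grant(self, user: str) -> None:
        # Security is decided inside the domain, not by a central team.
        self.readers.add(user)

    def publish(self, record: dict, required: tuple[str, ...]) -> None:
        # Quality is also the domain's job: reject incomplete records.
        missing = [key for key in required if record.get(key) is None]
        if missing:
            raise ValueError(f"{self.name}: missing fields {missing}")
        self.records.append(record)

    def read(self, user: str) -> list[dict]:
        if user != self.owner and user not in self.readers:
            raise PermissionError(f"{user} has no access to {self.name}")
        return list(self.records)  # hand out a copy, keep the original private


# Each domain has its own resources; changes in one do not affect the other.
sales = Domain(name="sales", owner="alice")
marketing = Domain(name="marketing", owner="bob")

sales.publish({"order_id": 1, "amount": 9.5}, required=("order_id", "amount"))
sales.grant("bob")

sales_rows_for_bob = sales.read("bob")
marketing_rows_for_bob = marketing.read("bob")
```

Here `sales.read("carol")` would raise `PermissionError`, because only the sales domain decides who may read its data.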

How does data mesh differ from a data warehouse and a data lake?

A data warehouse is generally used for storing processed data, while a data lake is used for storing raw data, for example to support machine learning. Data mesh takes a different approach from both and addresses the gap between the two.

  • A data warehouse, unlike data mesh, has a centralized architecture. This can lead to problems such as a lack of control over data. Data mesh gives control over the data to those who create it. Each domain gets its own data product manager, which reduces dependency on others.
  • A data lake is a centralized data repository that organizes and protects structured, semi-structured, and unstructured data from various sources. Data mesh, on the other hand, is a distributed architecture and encompasses data activities from producers to consumers. Data mesh architecture connects different data sources including data lakes and creates a unified and coherent infrastructure.

What is the difference between data mesh and data fabric?

In a data mesh, data is treated as a product, the structure is decentralized (or distributed), and data is stored in each of the different domains. Data fabric, by contrast, provides a single point of control for the data because data access is centralized. According to James Serra: “A data fabric and a data mesh both provide an architecture to access data across multiple technologies and platforms, but a data fabric is technology-centric, while a data mesh focuses on organizational change.”

Gulbahar Karatas
Gülbahar is an AIMultiple industry analyst focused on web data collections and applications of web data.
