There is a lot of talk about data being the new gold. Many companies are therefore pouring large sums of money into becoming data-driven. It is sold as a completely new way of doing business, almost as if business had never been driven by data before.
Sure, many companies struggle with being fully data-driven, but I think the bigger issue is that being data-driven is not sufficient at all.
We all seek to extract something magical from the data. That’s why we collect data in big data lakes, data warehouses and all the other big data collections that have not yet received a catchy marketing label. I believe we all fall into the trap of wanting to unearth a treasure trove of data. We somehow can’t escape the thinking that these massive data collections are the foundation of some analytical superpower that will ultimately propel our businesses to new heights.
Data technology and data engineering teams are then often seen as the decisive factor in converting data into business benefits. While technology and engineering are undeniably important, it takes a more holistic approach to truly transform a business into a digital powerhouse.
In this article I will approach the matter from the business side and explain why simply being data-driven isn’t enough. Believe me, you’re not going to take your company to the next level by creating a ‘data team’ to do some magic with your data.
I will explain how business should actually drive IT technology and why current data strategies often fall short in unlocking the full potential of data-empowered companies.
Business View
The functioning of a company can be understood as the interaction of many individual business processes to form a coherent whole. Each business process makes a small contribution to the realization of the company’s value proposition, which is usually the provision of products and services. Business people execute the processes on behalf of the organization based on their knowledge and available process descriptions. The customer finally receives the result as the product or service requested.

While this looks simple at first sight, the enterprise is in fact a complex adaptive system that interacts intensively with the outside world. The internal processes are numerous and never static. They constantly evolve based on the decisions and actions of employees. Employees are supported by applications that partially or completely digitalize the business processes. Data, as the intermediate output of digitalized business processes, is extensively shared within the company and with the outside world via channels.
In fact, data itself is distributed across the entire organization. It is kept private inside the business processes, managed solely by the process itself or the process owner. Data that is to be shared travels via channels and comprises all information needed by other processes to fulfill their business goals.
Hence, we have private business data on the inside of processes and public data that needs to be shared with other applications.
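The split between private and public data can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `ShippingProcess`, `OrderShipped` and the in-memory `Channel` are hypothetical names, not part of any real system. The process keeps its working notes to itself and publishes only what other processes need.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderShipped:
    """Public data: the information other processes need (hypothetical example)."""
    order_id: str
    customer_id: str
    shipped_items: int


@dataclass
class Channel:
    """Hypothetical in-memory stand-in for a data channel."""
    messages: list = field(default_factory=list)

    def publish(self, message) -> None:
        self.messages.append(message)


class ShippingProcess:
    """Owns its private working data and decides what becomes public."""

    def __init__(self, channel: Channel) -> None:
        self._channel = channel
        self._picking_notes = {}  # private data, never leaves the process

    def ship(self, order_id: str, customer_id: str, items: int, note: str) -> None:
        # The note stays inside the process ...
        self._picking_notes[order_id] = note
        # ... while the shareable facts go out via the channel.
        self._channel.publish(OrderShipped(order_id, customer_id, items))


channel = Channel()
ShippingProcess(channel).ship("o-1", "c-9", 3, "fragile, pack separately")
print(channel.messages)
```

Consumers of the channel see the shipped order, but they have no way to reach the picking notes.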
Data is an afterthought in business
The business is typically completely process-driven, with data used solely as a means to transfer information. A data channel has the one and only task of enabling the exchange of information.
The business world mainly thinks in terms of processes, not data. You will find tons of information about business process reengineering, but a discipline called business data reengineering doesn’t even exist.
Sure, we have analytical processes that require a lot of input data. All of that data needs to be modeled and structured, but business thinking is always driven by processes. We have to fulfill a customer order, comply with official regulations, prepare a balance sheet and income statement or train a model to automatically determine the price of a product to be sold. There is a process behind each and every data requirement.
The fact that we have business processes all the way through the enterprise should from my point of view drive IT architecture and, in particular, the discipline of data engineering. Let’s face the truth and recognize that data has no value if it is not required for a specific business process. I think it is the wrong idea to collect data in ever larger data repositories without a clear business process requirement.
And to be clear, the wish ‘give me all data available in the enterprise’ in order to do ‘something intelligent’ with it is not such a clear business requirement.
What does data-empowered mean?
Data in itself has no value. It’s merely digitalized information.
It can drive the business forward if we are able to provide each business process with exactly the data it needs. And it can even empower the business if we are able to derive insight and ultimately knowledge from it to improve the way we are doing business.
Data therefore seems to serve two purposes: To enable business operations and to form the basis for analytical evaluations that enable decisions for improving and further developing business.
What does it then mean for a company to be data-empowered?
A data-empowered company uses its data to optimally operate the business processes, continually refining them through informed decisions backed by advanced data analytics.
A company that is only concerned with maintaining the status quo would soon bleed to death. This means that analytical processes are at least as important as the operational ones. We are currently witnessing a surge in artificial intelligence deployment to develop smarter applications with ever more potential for accelerating the digital transformation. The importance of analytical processes, including leveraging machine learning to train intelligent models from data, will inevitably continue to grow.
However, analytical processes can tolerate longer periods of downtime. A temporary inability to make intelligent decisions and enhance business processes won’t immediately jeopardize the company’s survival. Therefore, IT has technically distinguished between operational processes and the less time-critical analytical processes.
Due to this technical distinction, our systems and the data have been categorically divided into operational and analytical realms. However, from a business perspective, this rigid technical separation makes no sense as both processes and their results are equally essential for the company to thrive.
In practice, there’s a continuous and dynamic interaction between operational and analytical business processes. For example, a data warehouse may supply data to train a machine learning model, which then generates key input (e.g. the price for a product) for an operational sales process. This process, in turn, can produce valuable customer behavior data for further analysis.
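To make this loop tangible, here is a minimal Python sketch. It rests on assumptions of mine: the "model" is deliberately naive (the mean of past prices per product) and stands in for real machine learning, and `train_price_model` and `sell` are hypothetical function names chosen for illustration.

```python
from statistics import mean


def train_price_model(sales_history):
    """Analytical process: derive a price per product from past sales.

    A naive stand-in for a trained model: the mean of all prices paid so far.
    """
    prices_by_product = {}
    for sale in sales_history:
        prices_by_product.setdefault(sale["product"], []).append(sale["price"])
    return {product: round(mean(prices), 2)
            for product, prices in prices_by_product.items()}


def sell(product, customer, price_model):
    """Operational process: applies the model's price and emits a sale record."""
    return {"product": product, "customer": customer, "price": price_model[product]}


# Illustrative history, as it might be supplied by a data warehouse.
history = [
    {"product": "widget", "customer": "c-1", "price": 10.0},
    {"product": "widget", "customer": "c-2", "price": 12.0},
]

model = train_price_model(history)       # analytical step
new_sale = sell("widget", "c-3", model)  # operational step uses the result
history.append(new_sale)                 # operational output becomes analytical input
model = train_price_model(history)       # the loop continues
```

Neither step is usable without the other: the sale needs the model's price, and the next training run needs the sale.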
Business people don’t differentiate between operational and analytical data. While we can classify processes as having operational or analytical purposes, data must seamlessly flow between all of them.
Data empowers business when it is accessible and usable across all business processes, whether analytical or operational.
From processes to data
A data model can always be distilled from a process model. To better understand this, it helps to consider the analogy between a business process and the application that actually digitalizes this process.
We said that the functioning of a company can be understood as the interaction of all business processes. Consequently, the sum of all applications (or microservices including orchestration) can be understood as the digitalized subset of all business processes in a company.
Business processes interconnect by exchanging information modeled as business data. For business data to be usable, consuming applications require rich information about the context and the data lineage. However, if we digitalize only the basic information without the business context and provenance, it is extremely difficult to interpret the data.
In an analogue world we would then call the process owner and ask for the missing business context. In the digital world the consuming application must be able to directly extract everything from the business data needed to fulfill its goal. The fundamental issue to be solved by data engineering is that business data should be able to exist independently of its generating business processes without any loss of business context information.
The current practice of preserving business context for data is to create separate metadata and enterprise data models. However, metadata is often incomplete and kept separate from the data it is meant to explain. And data models are often static and, at best, high-level representations buried in technical modeling tools, rarely up to date with what is really going on in a living enterprise.
The other practice is to keep data encapsulated in applications, so that consumers can query the source application for any missing context information. However, data is fundamentally different from applications and we should avoid delivering data as an application.
From the business perspective, it’s all about the correct digitalization of the entire business context, which is necessary to seamlessly link all business processes to a functioning whole for the enterprise.
Business drives technology
If we are serious about business driving IT and data engineering, we need to support the process view as comprehensively as possible. This means that data is only the enabler for the applications, which are the first-class citizens in enterprise IT.
Data engineering is crucial to enable the lossless exchange of the entire business context between applications in the company and beyond. Modern data technology can of course also be used to implement business logic. But this logic must be owned by the business people and their application developers. Data engineers must implement the data channels with the sole task of enabling lossless information exchange between all these applications.
Universal data supply addresses exactly this requirement with a data mesh. It organizes the company-wide distributed data flow between all applications without loss of context, with minimal delay and without the need for centralized data repositories separated from the operational world.
This requires a completely different approach compared to traditional data architectures such as data warehouse or data lake(house), which are mainly about extracting and gathering data from applications, but where the business context and data provenance are lost.
I have described an approach that fulfills universal data supply based on the original data mesh principles, but with important adjustments. To ensure that data truly empowers business, don’t implement a data mesh without these adjustments.
Universal data supply empowers business with data and is characterized by these principles:
- Business drives technology. Technology supports and empowers business.
- Applications implement business logic defined by the business processes. They serve both operational and analytical purposes in a well integrated way – analytical processing is in any case a matter for the application developers and by no means just for the data team.
- Public data, modeled following modern principles, is the means to exchange information between all applications.
- All public data must flow seamlessly between all applications and be self-sufficient, i.e. it must encapsulate the entire business context including its provenance at all times.
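What "self-sufficient" public data could look like is sketched below in Python. Treat it as one possible shape under my own assumptions, not as a prescribed format: `PublicRecord`, `Provenance`, the field names and the sample values are all hypothetical and purely illustrative. The point is that the record carries its business context and provenance with it, so a consumer never has to ask the source application what the numbers mean.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Where the data came from (hypothetical structure)."""
    source_process: str
    produced_at: datetime
    derived_from: tuple = ()  # ids of the upstream records that were used


@dataclass(frozen=True)
class PublicRecord:
    """Public data that travels together with its context and provenance."""
    record_id: str
    payload: dict   # the basic information
    context: dict   # business meaning: units, definitions, scope
    provenance: Provenance


def describe(record: PublicRecord) -> str:
    """Consumer side: interpret the record without calling the source."""
    p, c, v = record.payload, record.context, record.provenance
    return (f"{p['product']}: {p['amount']:.2f} {c['currency']} ({c['meaning']}), "
            f"produced by {v.source_process}, "
            f"derived from {', '.join(v.derived_from)}")


# Illustrative values only.
price = PublicRecord(
    record_id="price-001",
    payload={"product": "widget", "amount": 11.0},
    context={"currency": "EUR", "meaning": "list price excluding VAT"},
    provenance=Provenance(
        source_process="pricing-model-training",
        produced_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
        derived_from=("sales-snapshot-17",),
    ),
)

print(describe(price))
```

Because context and provenance are part of the record itself, they cannot drift apart from the data the way a separately maintained metadata catalog can.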
If you find this information useful, please consider clapping. I would be more than happy to receive your feedback, opinions and questions.