Data governance: the challenge of organizing data before scaling up AI
Data governance has become one of the key factors determining artificial intelligence's ability to generate real value.
According to the INE, 21.1% of Spanish companies with 10 or more employees already use artificial intelligence, nearly nine percentage points more than a year ago. However, this adoption does not always translate into results: achieving a clear return on investment in AI remains one of the main challenges facing management teams.
When an artificial intelligence project fails to deliver the expected results, it is common to point to the sophistication of the model or the technical capabilities of the infrastructure as the problem. However, the main obstacle often lies not in the algorithm, but in the governance of the data that feeds it. Without a data governance framework to organize and validate the information, scaling these types of models risks amplifying errors and losing sight of business objectives.
Fragmentation and the lack of an analytical mindset: the first obstacles to AI
Over the years, many companies have organized their data into separate silos: CRM systems for sales, databases for finance, external platforms for marketing... This fragmentation results in a lack of clear accountability and differing definitions of key concepts.
Let’s take a seemingly simple term as an example: the “active customer.” For the marketing team, this might be a user who has recently interacted with a campaign; for finance, it could be someone with paid invoices; and for sales, it might be someone with a current contract. If a company trains an AI model to predict customer churn using data that blends these perspectives without a unified standard, the algorithm will have to process conflicting signals. The problem in that case is not a model failure but decisions based on poorly defined data.
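The "active customer" conflict can be made concrete with a short Python sketch. Everything in it is hypothetical and for illustration only: the field names, the 30-day marketing window, and the three rule functions are assumptions, not definitions taken from any real system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Customer:
    # Hypothetical fields, one per departmental view of the customer.
    customer_id: str
    last_campaign_interaction: Optional[date]
    has_paid_invoice: bool
    contract_end: Optional[date]


def active_for_marketing(c: Customer, today: date, window_days: int = 30) -> bool:
    # Marketing view: interacted with a campaign within the window (assumed 30 days).
    if c.last_campaign_interaction is None:
        return False
    return today - c.last_campaign_interaction <= timedelta(days=window_days)


def active_for_finance(c: Customer) -> bool:
    # Finance view: has at least one paid invoice.
    return c.has_paid_invoice


def active_for_sales(c: Customer, today: date) -> bool:
    # Sales view: has a contract that has not yet expired.
    return c.contract_end is not None and c.contract_end >= today


def conflicting_labels(customers: List[Customer], today: date) -> List[str]:
    # Return the ids of customers on which the three definitions disagree.
    conflicts = []
    for c in customers:
        labels = {
            active_for_marketing(c, today),
            active_for_finance(c),
            active_for_sales(c, today),
        }
        if len(labels) > 1:  # both True and False are present
            conflicts.append(c.customer_id)
    return conflicts
```

Running `conflicting_labels` on a training set before building a churn model gives a rough count of the records whose label depends on which department is asked, which is the ambiguity a unified definition is meant to remove.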
In addition, according to the European Commission, one-quarter of EU companies store data but do not analyze it. This fragmentation and lack of an analytical culture explain why, when attempting to make the leap to AI, many companies discover that the foundations of their information systems are weak.
What a data governance framework should include in the age of AI
To address these shortcomings, it is necessary to implement a data governance framework that goes beyond a static inventory of databases. It must include dynamic policies, roles, and processes that ensure information is reliable, accessible, and traceable at the speed required by AI. With this approach, the three traditional pillars of data governance take on a much more strategic dimension:
- Results-oriented Data Owner: validates whether the data is suitable and secure for training a model and takes responsibility for the impact of AI outputs on the business. Augmented analytics provides tools to detect and correct inconsistencies before they compromise the models.
- Data Steward focused on continuous data flow: their role is to monitor information flows in real time to prevent data from changing in nature or losing quality, which would ultimately compromise the reliability of the AI model.
- Data Catalog as a project accelerator: it enables innovation teams to quickly and accurately determine which datasets are ready, validated, and secured to power an AI model.
This structure requires strict control over data lineage and quality. In the age of AI, traceability can no longer be addressed through a post-hoc audit: it requires tracking the entire journey of the information from its source to the algorithm that processes it, in order to identify and correct issues before they impact business decisions.
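As a rough illustration of tracking data from its source to the algorithm, the following Python sketch records each transformation step and walks back from a model to every upstream dataset. The `LineageLog` class and all dataset names are hypothetical; production lineage tools capture far more detail (timestamps, schema versions, responsible owners).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class LineageLog:
    """Minimal in-memory lineage record (illustrative only)."""

    def __init__(self) -> None:
        # Maps each output dataset to its (input dataset, step name) pairs.
        self._parents: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def record(self, output: str, inputs: List[str], step: str) -> None:
        # Register that `output` was produced from `inputs` by `step`.
        for source in inputs:
            self._parents[output].append((source, step))

    def upstream(self, dataset: str) -> List[str]:
        # Breadth-first walk: nearest ancestors first, each listed once.
        seen: List[str] = []
        queue = [dataset]
        while queue:
            current = queue.pop(0)
            for parent, _step in self._parents.get(current, []):
                if parent not in seen:
                    seen.append(parent)
                    queue.append(parent)
        return seen


log = LineageLog()
log.record("customers_clean", ["crm_export"], "deduplicate")
log.record("churn_features", ["customers_clean", "invoices"], "join")
log.record("churn_model", ["churn_features"], "train")
```

With such a record in place, a quality problem found in `crm_export` can be linked to every model that depends on it before the model's output reaches a business decision, rather than during a later audit.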
A roadmap for implementing data governance in AI
Data governance does not necessarily require a large-scale organizational transformation. An incremental approach, focused on priority AI use cases, can deliver value from the outset without disrupting business operations.
First, it is important to identify the company’s critical domains and select the specific datasets that support these use cases. This exercise is particularly necessary in sectors such as healthcare, finance, and manufacturing, where industry-specific AI models are already beginning to make a difference.
The next step is to assign clear responsibilities: each data domain should have a Data Owner in the corresponding business unit, rather than centralizing responsibility for data quality within the IT department. This involves aligning business and IT objectives and clarifying who makes decisions regarding the data.
From there, it is advisable to define quality thresholds using simple, automated rules that assess data quality before it enters the algorithm, based on factors such as format consistency and the absence of duplicates. This step makes it possible to identify which data assets are ready to feed into a model and which require prior cleaning, in order to properly scale the project.
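A minimal Python sketch of such rules follows, covering the two factors mentioned above: format consistency and the absence of duplicates. The field names (`email`, `customer_id`), the simplified email pattern, and the threshold values are assumptions chosen for illustration; each organization would set its own.

```python
import re
from typing import Dict, List

# Deliberately simple pattern for illustration; not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def format_consistency(records: List[dict], field: str, pattern: "re.Pattern") -> float:
    # Share of records whose `field` matches the expected format.
    if not records:
        return 1.0
    valid = sum(1 for r in records if pattern.match(str(r.get(field, ""))))
    return valid / len(records)


def duplicate_ratio(records: List[dict], key: str) -> float:
    # Share of records whose `key` value repeats one seen earlier.
    if not records:
        return 0.0
    seen = set()
    duplicates = 0
    for r in records:
        value = r.get(key)
        if value in seen:
            duplicates += 1
        else:
            seen.add(value)
    return duplicates / len(records)


def quality_report(
    records: List[dict],
    min_format: float = 0.95,      # assumed threshold
    max_duplicates: float = 0.01,  # assumed threshold
) -> Dict[str, object]:
    # Apply both rules and decide whether the dataset may feed a model.
    fmt = format_consistency(records, "email", EMAIL_PATTERN)
    dup = duplicate_ratio(records, "customer_id")
    return {
        "format_consistency": fmt,
        "duplicate_ratio": dup,
        "ready": fmt >= min_format and dup <= max_duplicates,
    }
```

A dataset whose report comes back with `ready` set to `False` would be routed to cleaning rather than to training, which is the triage described above.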
At the same time, it is advisable to build an incremental catalog that documents information assets progressively, starting with those that are already in production. The ultimate goal is to integrate governance into the workflow so that data validation and oversight occur automatically within the company’s processes. Once this point is reached, data governance ceases to be an isolated initiative and becomes part of standard operations.
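One way to picture an incremental catalog is the Python sketch below. The `CatalogEntry` fields and the `DataCatalog` methods are hypothetical and do not mirror any particular catalog product; the point is only that assets are registered one at a time and that production assets surface first in the review queue.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogEntry:
    # Hypothetical metadata kept for each documented asset.
    name: str
    owner: str            # Data Owner in the business unit
    in_production: bool
    validated: bool = False
    secured: bool = False


class DataCatalog:
    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Add or update a single asset; the catalog grows one entry at a time.
        self._entries[entry.name] = entry

    def ready_for_ai(self) -> List[str]:
        # Assets that are both validated and secured, in alphabetical order.
        return sorted(
            e.name for e in self._entries.values() if e.validated and e.secured
        )

    def pending_review(self) -> List[str]:
        # Assets still missing validation or security, production assets first.
        pending = [
            e for e in self._entries.values() if not (e.validated and e.secured)
        ]
        pending.sort(key=lambda e: (not e.in_production, e.name))
        return [e.name for e in pending]
```

Ordering `pending_review` by production status reflects the advice to document the assets already in production before the rest.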
Data governance takes center stage at Big Data & AI World 2026
Data governance need not be seen as an obstacle to digital transformation; rather, it is one of its key drivers. Attempting to scale up artificial intelligence without first organizing information assets can compromise the reliability of results in ways that are difficult to detect afterward. Companies that view governance as a strategic advantage are better able to mitigate risks, comply with regulations, and achieve a faster return on their AI investments.
To explore these methodologies in greater depth and discover how other companies are addressing these challenges, the upcoming edition of Big Data & AI World 2026 will focus on advanced analytics and data governance, among other current issues in the sector. This event, which is part of Tech Show Madrid, has become the leading forum for innovation in data, advanced analytics, and artificial intelligence in Spain.
On November 4 and 5, Big Data & AI World will bring together more than 40 leading exhibitors, 70 international speakers, and 4,000 trade visitors at IFEMA Madrid. Of those visitors, 37% are C-level executives such as CEOs, CDOs, CAIOs, and AI directors. It will be a unique opportunity to hear from industry leaders, anticipate trends, learn about real-world case studies, and connect with the partners who are shaping AI models.