The Data Governance Strategy Retailers Need to Prepare for Gen AI
Retailers are moving fast to incorporate generative artificial intelligence (Gen AI) into every corner of their operations. However, there's one major obstacle standing in the way: their data.
While most data teams are skilled at managing traditional retail data (e.g., transactional and inventory data), many of the most promising Gen AI use cases rely on unstructured data: everything from product descriptions and user reviews to social media posts and customer service call transcripts.
Unfortunately, the sheer volume of data available — and the pace at which it’s generated — can be overwhelming to account for and make sense of. In the era of Gen AI, though, success hinges on the ability to harness this data sprawl to fuel models that drive real-time decisions.
This shift demands a new kind of data governance strategy, one that supports both single-model use cases and "composite AI" solutions that leverage multiple models to tackle complex problems. More specifically, this strategy demands a mindset that integrates data as a core component in every business process. Here's what that looks like.
Focus on Activating Opportunities, Not Just Managing Risk
Historically, data governance in retail was focused on risk mitigation. Data stewards aimed to limit access as much as possible in order to protect customer data, comply with privacy laws, and avoid breaches. In many cases, it was better for employees not to touch internal data at all.
For retailers with Gen AI ambitions, however, this approach won't cut it anymore. AI needs rich, curated data to yield the best results. To get their data in shape, retailers must shift from a defensive to an enabling posture, investing in the governance infrastructure needed to curate, structure, and activate the data they have.
Build a Well-Rounded Operating Model
What does modern data governance actually look like in this context? It starts with six technical pillars:
- Discoverability: Retailers need to create robust catalogs and ontologies to surface the full range of available assets and make them understandable to AI systems. This is crucial for tracing an asset's lineage (i.e., where each asset comes from, how it was transformed, and whether it's suitable for a given AI use case). It's also necessary for effective AI. After all, the more discoverable your assets, the easier it is to provide AI with context-specific information that boosts the accuracy of each output.
- Management: Consistency and accuracy are crucial for maintaining quality. Multiple versions of each asset, for instance, can result in extra maintenance effort, outdated information, and poor decision-making. An approach like master data management can keep retailers from accumulating duplicate records and make it easier to curate the data needed for personalization and analytics.
- Literacy: Employees need to understand how data functions within the broader business context, and how manipulating (or deleting) data can have significant downstream effects, particularly on the models being trained. Key to fostering this understanding is specific, human-centric language that highlights the business outcomes. The better employees get to know their data, the more easily they can curate the information they need to answer core business questions.
- Security and Protection: On top of conventional data privacy and compliance concerns (e.g., data anonymization and PII protection), it's also important to ensure data is handled responsibly in AI contexts (e.g., by anonymizing where possible and mitigating bias, hallucinations, and toxicity when training models).
- Ownership: In many organizations, multiple teams generate data that flows into centralized lakes or warehouses. However, without clear lines of responsibility, data stewardship falters. Establish who is accountable for different types of information (e.g., HR data, product data, and customer data) and define roles and expectations. The numbers tell a compelling story: Supply chain leaders with top-tier analytics use data stewards 48 percent more often than their competitors.
- Change Management: A well-defined process for handling change — across processes, people, and data — will keep your governance wheel running properly. With change management baked in, your governance strategy will be resilient, not rigid, and will be able to adapt as your AI ambitions grow more complex.
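To make the discoverability and ownership pillars more concrete, here is a minimal sketch of a catalog that records each asset's owner and direct upstream sources, then walks that lineage to check whether an asset ultimately draws on personal data. Everything in it is hypothetical and for illustration only: the `CatalogEntry` fields, the asset names, the team names, and the `trace_lineage` and `touches_pii` helpers are assumptions, not the schema of any particular catalog product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One asset in a hypothetical data catalog (illustrative fields only)."""
    name: str
    owner: str                  # accountable steward or team
    contains_pii: bool = False  # flag used for privacy reviews
    derived_from: list[str] = field(default_factory=list)  # direct upstream assets


def trace_lineage(catalog: dict[str, CatalogEntry], name: str) -> list[str]:
    """Return every upstream asset of `name`, nearest first, without duplicates."""
    seen: list[str] = []
    queue = list(catalog[name].derived_from)
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue
        seen.append(parent)
        queue.extend(catalog[parent].derived_from)
    return seen


def touches_pii(catalog: dict[str, CatalogEntry], name: str) -> bool:
    """True if the asset, or anything upstream of it, is flagged as containing PII."""
    return any(catalog[n].contains_pii for n in [name, *trace_lineage(catalog, name)])


# Hypothetical assets and owners, chosen to mirror the kinds of data discussed above.
catalog = {
    entry.name: entry
    for entry in [
        CatalogEntry("call_transcripts_raw", owner="customer-service", contains_pii=True),
        CatalogEntry("call_transcripts_redacted", owner="data-platform",
                     derived_from=["call_transcripts_raw"]),
        CatalogEntry("product_reviews", owner="ecommerce"),
        CatalogEntry("support_insights", owner="analytics",
                     derived_from=["call_transcripts_redacted", "product_reviews"]),
    ]
}

if __name__ == "__main__":
    asset = "support_insights"
    print(asset, "is owned by", catalog[asset].owner)
    print("upstream assets:", trace_lineage(catalog, asset))
    print("needs privacy review:", touches_pii(catalog, asset))
```

In this sketch, a team evaluating `support_insights` for a Gen AI use case could see who is accountable for it and learn that it traces back to raw call transcripts, which would prompt a privacy review before the data is used.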
These pillars create the stable foundation needed for both single-model and composite AI use cases. After all, just like you can’t build a house with an Allen wrench, you can’t solve many of the most complex retail problems with a single AI model. You’ll want a full set of power tools (i.e., multiple models) to see the best results. And without proper maintenance (i.e., effective governance), your tools and your project will start to fall apart.
Of course, the technical aspect is just one piece of the puzzle. An effective data governance program should also focus on establishing models that account for each company's people, culture, and workflows. By meeting people where they are, you can curb resistance to change before it takes root.
In practice, that effort means gradually introducing new ways of working to minimize disruption. It also means ensuring that less data-savvy employees don't get left behind. After all, retailers are generating and handling so much data that it can easily overwhelm workers. It's important to make sure they understand what the data they have actually means and how to apply it day to day, whether with AI tools or other software.
Connect Good Governance to Clear Business Outcomes
Although there's a clear case for better data governance in the age of AI, leaders often meet a fair degree of internal resistance to change. Teams may view governance as "compliance theater" or fear it will slow down innovation.
The key to overcoming this resistance? Make a crystal-clear business case.
We recommend tying governance initiatives directly to outcomes like the following:
- Productivity: With better data governance, logistics teams can use AI to quickly flag and resolve supply chain inefficiencies — a process that might take months to do manually.
- Personalization: Accurate, consistent customer profiles are the foundation for AI-powered recommendations that can turbocharge your marketing, tailor product descriptions, and help you customize programs.
- Future Readiness: Paint a picture of how commerce will evolve with data-powered AI and focus on how these use cases will position your business to stay competitive for years to come.
Evolving your data governance model will require confident leadership. It will also demand bottom-up adoption. Everyone involved in creating or using data — from marketers to analysts to brick-and-mortar managers — needs to buy into the vision and its promise to ensure long-term success.
Data Governance is Key to Retail's AI Future
Gen AI holds enormous promise in retail. However, every model, no matter how advanced, is only as good as the data it's trained on and fueled by. As the saying goes, "garbage in, garbage out" — that is, bad data is bound to yield bad outputs.
That's why it's time to think of data governance as a business imperative. By doing so, retailers can tap into the full value of their data and, in turn, their AI.
The future of retail will be AI-powered, but only for those with the data foundation to support it. The time to build that foundation is now.
Daniel Vieira Viveiros is the senior vice president, data and analytics at CI&T, a global technology transformation specialist.
A global technology transformation specialist for 100-plus large enterprises and fast-growth clients, CI&T helps retailers engage customers, increase sales, and drive greater operational efficiencies.





