Critical task for online retailers: Clean up product data

Poor quality product data routinely has serious consequences for retailers. Left unchecked, bad data hinders the efficiency of business operations, product search and discovery, customer satisfaction and sales.

Bad product data, often hidden in plain sight, can critically affect retailers’ bottom lines. According to information technology research firm Gartner, poor data quality costs organizations $12.9 million per year on average. Beyond the immediate impact on revenue, over the long run bad data increases the complexity of data ecosystems and leads to bad decisions.

To make the impact of bad data on retailers more visible, SaaS-based e-commerce platform GroupBy hosted a webinar in September with Google Cloud partner Sada and e-commerce firm Rethink Retail. The event, titled “Bad Data, Big Problem: How to turn around poor product data”, looked at how companies can use AI to enrich data, improve search relevance and product discovery, increase customer satisfaction, reduce operational costs and increase returns.

The key to this level of success is to analyze the quality of product data and identify areas for improvement. Best practices include creating a standard data collection model, conducting regular reviews, and implementing AI-based solutions to automate the cleaning, standardization, and optimization of product data at speed and scale.

Enriching data with artificial intelligence can in turn improve operational efficiency, drive growth and strengthen brand reputation. According to Arvin Natarajan, chief product officer at GroupBy, poor product data plagues almost every retailer today, affecting every application that depends on it.

“Over the long term, poor data negatively impacts the customer experience and ultimately your bottom line,” he said.

Sophisticated generative AI models trained on GroupBy’s proprietary global taxonomy library can identify common data problems and revolutionize product data matching and management.

Using artificial intelligence in cloud-based product discovery

Powered by Google Cloud Vertex AI, GroupBy’s e-commerce product search and discovery platform offers retailers and wholesalers unique access to Google Cloud’s next-generation search engine. Designed for e-commerce, the platform uses AI and machine learning to process 1.8 trillion events and collect 85 billion new events per day from the entire suite of Google products.

With access to this data, GroupBy delivers digital experiences with a deep understanding of user intent. Natarajan noted that its partnership with Google ensures that customers will benefit from any future AI innovations that Google develops.

Incomplete, inaccurate and inconsistent product data can hinder search and discovery, resulting in lost revenue and reduced customer loyalty. Emphasizing the importance of AI in data enrichment, Natarajan cited a 20% increase in e-commerce revenue after optimizing product catalog data for search and discovery.

Revealing lost revenue from faulty data

Technology, or its misuse, can make it difficult for retailers to recognize the existence of bad data. Rethink Retail e-commerce strategist Vinny O’Brien recounted an example from his earlier days at eBay, where faulty indexing caused a continued loss of revenue as product listings suddenly became invisible.

It took working with a partner to discover that eBay had failed to normalize any product data. So, for example, if someone searched for Nike shoes, but the product data was missing a capital N in the formatting when the product was uploaded, that product would disappear after the first stage of the search.

The failure was not limited to one product. It recurred systematically for other sellers on the platform.

“So you just disappeared. You’ve lost about 30% of your search volume. When we finally solved the problem, which was not an easy job in a company of this size, we were getting about 20 to 25% more revenue for organizations, especially those that had large catalogs, because we were getting a lot of long, long tail searches and so on. So it’s an area with significant impact,” he said.
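
The normalization failure O’Brien describes can be illustrated with a minimal sketch. It is not eBay’s implementation, which the article does not detail. The function names, the sample listings and the choice of Unicode NFKC folding plus case folding are all assumptions made for illustration. The point is only that titles and queries must pass through the same normalization step before they are compared.

```python
import unicodedata


def normalize(text: str) -> str:
    """Return a case- and whitespace-insensitive form of text for indexing."""
    # NFKC folds compatibility characters (such as full-width letters),
    # and casefold() removes case differences such as "nike" vs "Nike".
    folded = unicodedata.normalize("NFKC", text).casefold()
    # Collapse runs of whitespace so "nike   shoes" equals "nike shoes".
    return " ".join(folded.split())


def build_index(titles):
    """Map each normalized token to the set of titles that contain it."""
    index = {}
    for title in titles:
        for token in normalize(title).split():
            index.setdefault(token, set()).add(title)
    return index


def search(index, query):
    """Return the titles that contain every token of the normalized query."""
    tokens = normalize(query).split()
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical listings, uploaded with inconsistent capitalization and spacing.
listings = [
    "Nike Air Zoom running shoes",
    "nike court shoes",
    "NIKE  Trail Shoes",
    "Adidas running shoes",
]
index = build_index(listings)
matches = search(index, "Nike shoes")
```

With this sketch, the query "Nike shoes" returns all three Nike listings regardless of how the seller capitalized the brand. An index built on the raw strings would return only the exact-case match.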

The challenges of dealing with bad data in isolation

According to Joyce Mueller, director of retail solutions at Sada, bad data is an unintended consequence rather than the result of a deliberate effort to deprioritize product data. It has long been a problem.

Bad data results from incomplete, inaccurate or missing fields. Perhaps the wrong data specifications are supplied or there is inconsistency between SKUs at play, she suggested. Without clean data feeds to tie it all together, we end up with data that isn’t necessarily as complete as we’d like, Mueller continued.

“It was mostly a problem for back-end systems. But now, when product data isn’t complete, accurate, well-described, or in good style and character, it’s actually causing problems for digital shoppers. This will make your product less discoverable,” she warned.

The elusive goal of data standardization

Imposing universal data standards has proved a losing battle. Earlier efforts met with only limited success.

O’Brien noted that around 2010, all major retail e-commerce platforms made merchants adhere to a standard set of data for each product in order to be visible. That mandate was only partly successful as a strategy.

“I think managing the scale of data is a challenge when large companies have these mandates,” he offered. “It has to be accepted by everyone and everyone has to adapt.”

The scope of this data management and governance is huge, he added. Various industries come into play, be it business-to-business or business-to-consumer. Within those verticals could be food applications or medical-type products, he said, noting additional compliance complications.

“Different types of industries also have their own nuances. Managing all of this on a large scale is extremely difficult,” argued O’Brien.

Bridging the gap in data management

Natarajan added that when he talks to retailers or distributors at conferences, he sees a gap between manufacturers and retailers. In the end, it is a gap that retailers also have to manage, so there are many nuances to navigate.

“Managing this type of data at scale comes with a lot of challenges, which I think is probably why we haven’t seen a level of standardization in product data spread across all different industries, all different verticals, and retailers of every size,” he mused.

Sada’s Mueller said she’s not aware of any retail subvertical handling it well. But she sees digital natives doing better simply because they are newer.

“When you think about traditional retailers, they have long-standing systems that don’t necessarily talk to each other. It’s harder for a more established company to solve these kinds of problems and mold and shape itself in a way that embraces the new technology. They have a bigger legacy with more technical debt,” she noted.

Some industries may have a better chance of managing their data because their products are less complex. According to Natarajan, some of these categories have fewer product attributes than more technically complex products such as machinery and engines.

“You have this difference in the types of products that will lead to better data management because it’s easier to manage some of these less complex products,” he said.

AI solutions for data enrichment

The panel of experts discussed steps distributors and retailers can take to overcome the problem of bad data:

Audit product data, starting with the most critical categories.
Implement AI-powered data enrichment and cleansing solutions to improve product data quality.
Measure the impact of data quality improvements on metrics such as revenue, customer satisfaction and returns.
Establish a data management process to ensure consistent and accurate product data going forward.
Explore free trials of AI-powered data enrichment tools and evaluate the impact on your product catalog.
Identify a champion within the organization, potentially from the product merchandising team, to lead the data enrichment initiative.
Modernize data pipelines and consolidate product data into a centralized cloud system that enables more advanced analytics and automation.
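
The first step above, auditing product data by category, can be sketched in a few lines. This is an illustrative example only, not a tool mentioned by the panel. The list of required fields, the record layout and the completeness score are all assumptions. A real audit would follow the retailer’s own data model.

```python
from collections import defaultdict

# Hypothetical set of fields that every product record should fill in.
REQUIRED_FIELDS = ("title", "brand", "category", "description", "price")


def is_filled(value) -> bool:
    """Treat None and blank strings as missing; any other value counts."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    return True


def audit_by_category(products):
    """Return (category, completeness, missing-field counts), worst first."""
    totals = defaultdict(int)
    missing = defaultdict(lambda: defaultdict(int))
    for product in products:
        category = product.get("category") or "(uncategorized)"
        totals[category] += 1
        for field in REQUIRED_FIELDS:
            if not is_filled(product.get(field)):
                missing[category][field] += 1
    report = []
    for category, count in totals.items():
        gaps = sum(missing[category].values())
        # Share of required fields that are actually filled in.
        completeness = 1 - gaps / (count * len(REQUIRED_FIELDS))
        report.append((category, round(completeness, 2), dict(missing[category])))
    # Sort so the least complete categories come first.
    return sorted(report, key=lambda row: row[1])


# Hypothetical catalog with a blank description and a missing brand.
catalog = [
    {"title": "Trail shoe", "brand": "Nike", "category": "footwear",
     "description": "", "price": 89.0},
    {"title": "Court shoe", "brand": None, "category": "footwear",
     "description": "Leather court shoe", "price": 75.0},
    {"title": "Cordless drill", "brand": "Acme", "category": "tools",
     "description": "18V drill", "price": 129.0},
]
report = audit_by_category(catalog)
```

Sorting the report with the least complete categories first gives a simple way to decide where enrichment effort should start. Rerunning the same audit after a cleanup gives a before-and-after measure of data quality.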