How EDITED automated retail’s biggest problem with Product Matching

Introducing the new EDITED Product Matching API feature and how it gives retailers unparalleled visibility into their products across various markets and regions.

After one of the toughest trading periods in history, strategic product positioning with the right retail data can make or break your business. 

Thanks to the internet and social media, trends are no longer contained to their respective regions, as consumers yearn for an aesthetic inspired by the other side of the world. This means that retailers not only need to turn around the latest global trends but also determine their optimal product positioning in each market. This analysis could cost a company millions of dollars when considering the amount of labor and resources required. So over the past 1.5 years, the EDITED team set out to solve the complex task of Product Matching, something that until now could not be achieved without manual labor. Product Matching can track where a product is sold and find similar examples of a specific product.

With the world’s largest retail dataset of over 3.5 billion SKUs, the EDITED Retail Intelligence Platform is able to filter and analyze specific markets, retailers, brands, colors, patterns or garment categories. In this report, we illustrate how Product Matching can assist with the problems and constraints the industry is facing, as well as highlight the untapped opportunities retailers can leverage to succeed in today’s retail climate.

Want to see how your products are performing across various global markets in one place? Reach out for a demo today. 

The problem

Retailers lose millions of dollars every year because of poorly positioned products, pricing behaviors that fall outside of agreements and promotions they are unaware of. The growth of ecommerce has meant that traditional partnerships have moved online. With third-party websites like Zalando, ASOS and Net-a-Porter stocking more brands than ever before, wholesale and pricing teams are facing an increasing battle for visibility. It’s critical for retailers to ensure their pricing is competitive compared to similar products and that their own assortment, wherever it is being sold, is priced consistently.

Currently, this process is incredibly time-consuming for a retailer and virtually impossible to scale. Market analysis of brand positioning can take weeks, and by the time the analysis is done, it’s already out of date. Reliance on outdated, manual processes that are entirely dependent on employee resources costs brands and retailers millions of dollars in missed opportunities every year, which is why EDITED has worked hard to solve this.

What is Product Matching? 

The EDITED Product Matching technology uses machine learning to identify all exact and similar instances of a product in our database. We define an exact product match as one that matches on garment category, color, pattern and brand, but is sold at a different retailer or in a different market. A similar product match, by contrast, is a product with more deviation, such as a different color, pattern or brand.

For example, this “Belted Striped Dress” is sold at Farfetch in the UK. Exact matches of the same “Belted Striped Dress” are also sold at Farfetch, but in different markets. This allows the user to see how this product’s price varies globally.

In comparison, this product has similar matches to other “Belted Dresses.” They belong to the same brand (Peserico) and have the same color (neutral), but a different pattern (plain vs. striped). 
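The two definitions above can be sketched as a simple rule. This is only an illustration of the definitions, not how the machine learning models decide: the `Product` fields, the `classify_match` helper and the rule that a similar match must at least share a garment category are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical record; field names are illustrative only.
    name: str
    brand: str
    category: str
    color: str
    pattern: str
    retailer: str
    market: str


def classify_match(a: Product, b: Product) -> str:
    """Label a pair of products using the definitions in the text."""
    same_attributes = (a.category, a.color, a.pattern, a.brand) == (
        b.category, b.color, b.pattern, b.brand)
    same_listing = (a.retailer, a.market) == (b.retailer, b.market)
    if same_attributes:
        # Exact match: same attributes, different retailer or market.
        return "same listing" if same_listing else "exact"
    if a.category == b.category:
        # Assumption: similar matches share a category but deviate elsewhere.
        return "similar"
    return "no match"


striped_uk = Product("Belted Striped Dress", "Peserico", "dress",
                     "neutral", "striped", "Farfetch", "UK")
striped_us = Product("Belted Striped Dress", "Peserico", "dress",
                     "neutral", "striped", "Farfetch", "US")
plain_uk = Product("Belted Dress", "Peserico", "dress",
                   "neutral", "plain", "Farfetch", "UK")

print(classify_match(striped_uk, striped_us))  # exact
print(classify_match(striped_uk, plain_uk))    # similar
```

In practice there is rarely a clean set of shared attributes to compare, which is why the matching itself has to be learned rather than written as rules.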

In a database of billions of products, identifying matches isn’t easy. Now imagine trying to teach a machine to do this accurately, without sacrificing value or scale.

Our first universe

As you can imagine, it was no easy feat. In most cases, there is no common naming convention or SKU across retailers, and images can vary slightly. We ran manual tests, such as sifting through our database for matched products, to figure out the best way to maximize the accuracy of the AI.

A high level of accuracy was essential for our first focus, luxury retail. The industry raked in $45 billion in 2020 alone, so making sure your products are competitively positioned across this market is crucial. The inclusion of high-end luxury products in mass promotions was costing retailers valuable market share, as well as the hearts and minds of their core customer base. Through market research and collaboration with our customers, we created the Luxury Fashion Universe, which stands at 60 million products, covering 12 markets and over 650 retailers across women’s, men’s, girls’, boys’ and unisex products.

With over two years of retail data in the EDITED Product Matching feature, we used the information collected through our trackers (i.e. images, name, description, brand, color, garment category and gender) to match products, thereby creating the Luxury Universe and other analyzed retail markets. Put simply, our team taught a machine to recommend exact or similar product matches for each dedicated retail market, automating a laborious process in a way never before seen in the industry.

So how does it really work?

The aim with Product Matching was to use all the information across the 3.5 billion+ products in our database to teach machines to find matching products. Each field crawled provides extra insight into a product that could lead to matches. A human looking at two products can easily tell whether they are the same. A machine cannot: if the two listings have slightly different images, SKUs or names, the information in their care instructions might be the clue it needs to find the match. For example, in our eyes these two images are the same product, but to a machine they are very different.

To successfully determine what fields help or what good matching models look like, defined metrics are required. So what does a successful match actually mean?

In standard machine learning classification problems, precision and recall are popular metrics for measuring accuracy. But given the scale of EDITED’s database, it’s impossible to look through every single product to be certain that we have captured every matching product. So we created an in-house metric that combines precision and recall into one, known as “precision over max returned.” This metric rewards models that bring back quality matches, while penalizing models that produce fewer matches than other contending models, which accounts for recall.
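The exact formula behind “precision over max returned” isn’t spelled out here, so the sketch below shows one plausible reading rather than EDITED’s implementation: each model’s count of correct matches is divided by the largest number of matches returned by any contending model. The function name and the input format (a list of QA judgements per model) are assumptions for illustration.

```python
from typing import Dict, List


def precision_over_max_returned(judged: Dict[str, List[bool]]) -> Dict[str, float]:
    """Score each model as correct matches / max matches returned by any model.

    `judged` maps a model name to the QA verdicts (True = correct match)
    for the matches that model returned for the same query set.
    This is an assumed reading of the metric, not a confirmed definition.
    """
    max_returned = max((len(v) for v in judged.values()), default=0)
    if max_returned == 0:
        return {name: 0.0 for name in judged}
    return {name: sum(v) / max_returned for name, v in judged.items()}


judged = {
    # Cautious model: 2 matches returned, both correct (plain precision 1.0).
    "cautious": [True, True],
    # Broader model: 5 matches returned, 4 correct (plain precision 0.8).
    "broad": [True, True, True, True, False],
}
scores = precision_over_max_returned(judged)
print(scores)
```

Under this reading, the cautious model is perfectly precise but scores lower than the broad one (2/5 versus 4/5), because it brought back fewer matches than a contender did.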

Over the last year and a half, we’ve evaluated hundreds of variations of Product Matching, running different candidates through analytical and in-person QA. The goal was to produce a large-scale infrastructure that used a KNN (k-nearest neighbors) algorithm on our product data to find matches. KNN algorithms look for the K nearest matches within each universe. These algorithms require raw data to be transformed into vectors. In order to vectorize the data successfully, we broke it down into two groups: text data and image data. This provided the flexibility needed to produce the best results for each data type and combine them at the end.
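To make the KNN step concrete, here is a minimal brute-force sketch over a handful of made-up vectors. The cosine similarity measure, the toy three-dimensional vectors and the product IDs are all assumptions for the example; a production system at this scale would use distributed or approximate search rather than a full scan.

```python
import heapq
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest(query_id: str, vectors: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the IDs of the k products most similar to the query product."""
    query = vectors[query_id]
    scored = (
        (cosine_similarity(query, vec), pid)
        for pid, vec in vectors.items()
        if pid != query_id  # a product should not match itself
    )
    return [pid for _, pid in heapq.nlargest(k, scored)]


# Toy vectors standing in for vectorized product data (illustrative only).
vectors = {
    "striped_dress_uk": [1.0, 0.0, 0.2],
    "striped_dress_us": [0.9, 0.1, 0.2],
    "plain_dress_uk": [0.7, 0.0, 0.7],
    "sneaker_uk": [0.0, 1.0, 0.1],
}
print(k_nearest("striped_dress_uk", vectors, k=2))
```

Here the near-identical US listing ranks first and the plain version of the dress second, while the unrelated sneaker is left out.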

For both our text and image data, we ran dozens of experiments on how to vectorize, using Autoencoders, Transformers, Convolutional Neural Nets (CNN) and text ranking. Choosing the best method of vectorization was not only about finding what gave the best matching accuracy, but also what was scalable at a reasonable price. Each method presented its own challenges. For example, Autoencoders and TF-IDF required large amounts of data for pre-training.
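As a rough picture of what vectorizing text means, the sketch below builds TF-IDF vectors for three product names. It is a generic, simplified TF-IDF (whitespace tokens, smoothed inverse document frequency) chosen for illustration; it is not a description of the method that was finally selected, and the `tfidf_vectors` helper is hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def tfidf_vectors(docs: List[str]) -> Tuple[List[str], List[List[float]]]:
    """Turn short texts into TF-IDF vectors over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many texts each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, so rarer tokens get a higher weight.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid dividing by zero on empty text
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


names = ["Belted Striped Dress", "Belted Dress", "Leather Sneaker"]
vocab, vectors = tfidf_vectors(names)
print(vocab)
print(vectors[0])
```

In the first name, "striped" gets a higher weight than "belted" because it appears in only one of the three names, which is the kind of signal that helps separate an exact match from a similar one.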

The final choice of vectorization method and KNN required large-scale infrastructure to run at the scale of EDITED’s database. Here we have used Spark’s ability to handle large amounts of data, as well as an internal process that vectorizes products as they are crawled. The last step in the process was to connect the image and text results back together into one model. This way any product requesting matches gets the best results found from both. So if any product has missing or tricky data, we are still able to return quality results.
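How the image and text results are joined isn’t detailed here, so the following is only one simple way to do it: keep each candidate’s best score from either source and re-rank. The `combine_matches` helper, the score dictionaries and the assumption that text and image scores are on a comparable scale are all illustrative.

```python
from typing import Dict, List, Tuple


def combine_matches(
    text_scores: Dict[str, float],
    image_scores: Dict[str, float],
    k: int,
) -> List[Tuple[str, float]]:
    """Merge two candidate sets, keeping each candidate's best score."""
    combined = dict(text_scores)
    for pid, score in image_scores.items():
        combined[pid] = max(score, combined.get(pid, float("-inf")))
    # Highest score first; ties broken by product ID for a stable order.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


text_scores = {"dress_us": 0.9, "dress_fr": 0.4}
image_scores = {"dress_fr": 0.8, "dress_de": 0.7}

print(combine_matches(text_scores, image_scores, k=3))
# A product with no usable images still gets matches from text alone.
print(combine_matches(text_scores, {}, k=3))
```

The second call shows the fallback described above: when one source has nothing to offer, the other still returns results.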

Future of Product Matching 

The primary teams looking to use a Product Matching function are typically data, wholesale and international teams. These groups look at how their own products are being sold and promoted across global markets. They want to understand whether their main brand is under- or over-indexed by comparison, and what an optimal price is. Product Matching opens the door for retailers to react to competitive pricing in real time, in a global marketplace, against several competitors at once.

Retail teams who use our Product Matching universes are now able to view assortment, pricing and promotional changes across their global presence like never before. The ability to pin, group, filter and analyze results, and to return to these sets whenever they want, gives users the flexibility to tailor what they are looking at to their unique use case. These functionalities and more analysis can be accessed through the EDITED API and app, which are now available for both the activewear and luxury universes, two key markets of interest for retailers. EDITED’s goal is to avoid filtering the results too heavily, so that retailers have a wide enough net to fish for the products they need to analyze in order to maintain their competitive advantage.

And that’s only the beginning. Beyond our luxury and activewear universes, EDITED has plans to extend Product Matching to more markets, especially now that retailers can derive insights in real time with this new technology.
