
MandM Direct: Managing Models at Scale with Dataiku + GCP

MandM Direct uses Dataiku and Google Cloud Platform (GCP) to build and maintain a data science practice that operationalizes 10x more models than a code-only approach, supporting a wide range of use cases, including personalized marketing campaigns.

Organizations are moving from experimenting with machine learning to scaling it in production environments, but one difficulty is maintenance. How can companies go from managing just one model to managing tens, hundreds, or even thousands?

“As the machine learning community continues to accumulate years of experience with live systems, a wide-spread and uncomfortable trend has emerged: developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive.”

— Google, Hidden Technical Debt in Machine Learning Systems

This is exactly the challenge faced by MandM Direct, one of the largest online retailers in the United Kingdom with over 3.5 million active customers and seven dedicated local market websites across Europe. The company delivers more than 300 brands annually to 25+ countries worldwide, and in 2020 it grew fast. That accelerated growth meant more customers and, therefore, more data, which magnified some of the company’s challenges and pushed it to find more scalable solutions.

Challenges

MandM Direct’s rapid growth resulted in two big challenges:

  1. Getting all the available data out of silos and into a unified, analytics-ready environment: The core data team is made up of four people (two data scientists, one senior analyst, and one data analyst), but they extend their reach through a hub-and-spoke model for their data center of excellence, working with analysts embedded across the business lines to scale their efforts. However, this requires an easy way for those teams to use data to answer business questions without necessarily writing code.
  2. Scaling out AI deployment in a traceable, transparent, and collaborative manner: MandM’s first machine learning models were written in Python (.py files) and run on the data scientist’s local machine, and they needed a way to prevent interruptions or failure of the machine learning deployments.

In an attempt to tackle the second challenge, the team moved these .py files to Google Cloud Platform (GCP), and the outcome was well received by the business and technical teams in the organization. However, once the number of models in production grew from one to three and beyond, the team quickly realized how burdensome maintaining them was. There were too many disconnected datasets and Python files running on the virtual machine, and the team had no way to check or stop the machine learning pipeline. They needed another solution.

“Having a platform like Dataiku allows our data scientists to focus on building cool things, not spending hours and hours on maintenance and making sure things are running. With workflows deployed in Dataiku, we save literally days of work every month.”

Ben Powis, Head of Data Science at MandM Direct

The Solution: Dataiku + GCP

MandM turned to the combination of Dataiku and GCP to answer its two critical challenges. With Google BigQuery’s fully managed, serverless data warehouse, MandM could break down its data silos and democratize data access across teams. MandM Direct was one of the first online retailers to implement Google BigQuery across the organization.

At the same time, thanks to Dataiku’s visual and collaborative interface for data pipelining, data preparation, model training, and MLOps, MandM could also easily scale out their models in production without failure or interruptions in a transparent and traceable way.

MandM now has hundreds of live models, all with visibility into model performance metrics, clear separation of design and production environments, and many more MLOps capabilities built into the platform.

Teams can now easily push down and offload computations for both data preparation and machine learning to GCP. Using Dataiku means this capability is accessible to all user profiles across MandM, without their needing to know the underlying technologies or complexity.

Results, Impact, and What’s Next

The benefits MandM has seen from using Dataiku and GCP aren’t limited to time saved on tedious maintenance work; the data team is also having more impact across the business. It can now deliver solutions to a variety of business problems, from adtech to customer lifetime value, whether that’s a dashboard, a more detailed piece of analysis, or a machine learning project deployed in production.

“Broadly, we love Dataiku. We do have a mix of people that go more toward AutoML and visual tools as well as one data scientist who loves to work in code. But that’s the beauty of Dataiku and why we chose it — we didn’t want a low-code tool where we could get lazy and just click a few buttons. Now the team has the flexibility: if they want to nerd out and go under the hood, they can do that. If they need a quick model, they can do that too.”

— Ben Powis, Head of Data Science at MandM Direct

For example, business users in the buying and merchandising teams could interact with machine learning models in their day-to-day work through Dataiku applications, which provide a nontechnical interface for projects developed by the data team.

The team is also particularly proud of the work they’ve done to build out a feature library with Dataiku that contains more than 400 features specific to MandM’s business. Now, the feature library is the first place people go, sort of like a shop window for machine learning projects, and it takes away the monotony and repetition of their work.
