Federated Learning on the Edge May Out-Compete the Cloud on Privacy, Speed, and Cost

Machine Learning (ML), a subset of Artificial Intelligence, is an approach that draws patterns from data to improve performance on a given task. For example, TikTok uses ML to decide what content to recommend to each user: do you engage with sports videos but skip country music? The Apple Watch runs on-device ML, using data from the gyroscope, accelerometer, and optical heart sensor to classify your movement (or lack thereof) and detect when you are playing tennis or sleeping.

Federated Learning (FL) on the edge is a machine learning approach that takes this a step further. Its architecture is designed to improve model accuracy, prioritize data privacy, and reduce unnecessary network bottlenecks compared to traditional cloud-centric machine learning deployments.

So what are the fundamentals of FL and which industries are poised to offer their users a better experience by investing in this technology in 2023? Let’s start by taking a look back.

The (inadequate) status quo!

In the 2000s, the “cloud” began to take off. Programmers and businesses started to procure virtual compute resources in an on-demand fashion to run their software and applications. Over the last two decades, developers have grown accustomed to and reliant on instantly available infrastructure that is managed and maintained by someone else. And this is no surprise. Abstracting hardware and infrastructure away enables developers and companies to focus on product innovation and user features above all else.

Amazon Web Services, Microsoft Azure, and Google Cloud Platform have made storage and compute ubiquitous, on-demand, and straightforward to deploy. And these hyperscalers have built robust, high-margin businesses atop this approach. Amazon alone generated $19 billion in revenue last quarter from AWS. Organizations reliant on the cloud have traded capital expenditures (servers and hardware) for operating expenditures (pay-as-you-go compute and storage resources).

Although the cloud’s ease of use is a boon to any upstart team trying to innovate at all costs, cloud-centric architecture becomes a significant cost of revenue as a company scales. By some estimates, cloud infrastructure accounts for roughly 50% of the cost of revenue at large SaaS companies. As machine learning continues to grow in popularity and utility, organizations store an increasing amount of data in the cloud and train larger and larger models in search of higher model accuracy and greater user benefit. This deepens their reliance on cloud providers, and companies find it difficult to repatriate workloads to on-prem solutions: doing so would require them to hire a stellar infrastructure team and re-architect their systems altogether.

Organizations are looking for solutions that enable new product innovation and offer high accuracy with low latency while still being cost-effective – enter Federated Learning on the edge.

What is Federated Learning (FL) on the edge?

Federated Learning, or collaborative learning as it’s also called, takes a different approach to data storage and compute. Whereas popular cloud-centric machine learning approaches send data from your phone, for example, to centralized servers and aggregate it in a silo, FL on the edge keeps data on the device (e.g. your mobile phone or your tablet). It works in the following way:

Step 1: Your edge device (or mobile phone) downloads an initial model from an FL server.

Step 2: You then conduct on-device training where the data on your device improves the model.

Step 3: The encrypted training results are sent back to the server for model improvement while the underlying data sits safely on the user’s device.

Step 4: With the model on the device, you conduct training and inference on the edge in a completely distributed and decentralized way.

This loop continues iteratively and your model accuracy increases.
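
The loop above can be sketched in a few lines of Python. This is only an illustrative toy, not any vendor's implementation: it assumes a tiny linear model (y = w*x + b), synthetic data, and simple weighted averaging of client weights on the server (commonly known as federated averaging). The function names (local_update, federated_average, run_rounds, make_client_data) are hypothetical, and the encryption mentioned in Step 3 is omitted for brevity.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Step 2: on-device training via gradient descent on mean squared error."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates):
    """Step 3 (server side): average client weights, weighted by sample count."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)

def run_rounds(global_weights, client_datasets, rounds=100):
    """Repeat the download / train / aggregate loop for a number of rounds."""
    for _ in range(rounds):
        updates = []
        for data in client_datasets:
            # Step 1: the client starts from the current global model.
            local = local_update(global_weights, data)
            # Only the trained weights and a sample count leave the device;
            # the raw (x, y) pairs never do.
            updates.append((local, len(data)))
        global_weights = federated_average(updates)
    return global_weights

def make_client_data(rng, n):
    """Synthetic per-device data drawn from y = 3x + 1 with small noise (assumed)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client_data(rng, n) for n in (20, 30, 50)]
w, b = run_rounds((0.0, 0.0), clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

In this toy setup the server never sees a single data point, yet the shared model moves toward the underlying relationship (w near 3, b near 1) as rounds accumulate. Real deployments add pieces this sketch leaves out, such as secure aggregation, client sampling, and handling devices that drop out mid-round.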

The benefits for the user

When you aren’t reliant on or bottlenecked by the centralization of data, the user benefits in dramatic ways. With Federated Learning on the edge, developers can improve latency, reduce network calls, and drive power efficiency, all while promoting user privacy and improved model accuracy. FL on the edge is enabled by the ever-increasing hardware capability of the phones in our pockets. Each year, on-device computation and battery life improve, and as they do, FL techniques will unlock increasingly complex and personalized use cases. Imagine, for example, privacy-centric software on your phone that automatically drafts replies to incoming emails in your individual tone, punctuation style, slang, and other hyper-personalized attributes. All you have to do is click send.

Enterprise pull is strong

In my conversations with multiple Fortune 500 companies, it has been blindingly obvious how much demand there is for FL on the edge across sectors. Chief Technology Officers express how they’ve been searching for a solution to bring FL techniques on the edge to life. Chief Financial Officers reference the millions of dollars spent on infrastructure and model deployment that could otherwise be saved in an FL approach. In my opinion, the three industries that have the most potential to reap the rewards from FL are finance, media and e-commerce. Let me explain why.

Use Case #1: Finance - improved latency and security

Many large multinational financial companies (e.g. Mastercard, PayPal) are eager to adopt FL on the edge to help them detect account takeovers, money laundering, and fraud. More accurate models are sitting on the shelf and have not been approved for launch. Why? These models increase latency just enough that the user experience is negatively impacted. We can all think of apps we no longer use because they took too long to open or crashed, and companies can’t afford to lose users for these reasons. Instead, they accept a higher false negative rate and suffer excess account hijacking, laundering, and fraud. FL on the edge empowers companies to reduce latency to less than ten milliseconds while showing a 50% relative uplift in model performance compared to traditional cloud-centric deployments.

Use Case #2: Media - hyper personalization

In the media sector, companies like Netflix and YouTube want to make their suggestions of movies and videos more relevant. The Netflix Prize famously awarded $1M for a 10% uplift in performance compared to its own algorithm. FL on the edge has the potential to offer similar impact. Today, when a new show is launched or a popular sporting event is live (like the Super Bowl), companies reduce the signals they gather from their users. Otherwise, the sheer volume of data (at a rate of millions of requests per second) causes a network bottleneck that prevents them from recommending content at scale. With edge computing, companies can leverage these signals to suggest personalized content based on individual users’ tastes and preferences.

Use Case #3: E-Commerce - more timely and relevant suggestions for consumers

Lastly, e-commerce and marketplace companies want to increase click-through rates (CTR) and drive conversions based on real-time feature stores. This enables them to re-rank recommendations for customers and serve more accurate predictions without the lag of traditional cloud-based recommendations. Imagine, for example, opening the Target application on your phone and getting highly personalized recommendations for products in a completely privacy-centric way, with no identifying data leaving your phone. Federated Learning has shown a double-digit percentage increase in CTR and in purchases thanks to a more performant, privacy-aware model that offers users more timely and relevant suggestions.

The Market Landscape

Thanks to technological advances, large corporations and start-ups are both working to make federated learning more ubiquitous so that companies and consumers can benefit. For companies, this likely means lower costs, while for consumers it may mean a better user experience. There are already a few early players in the space: Amazon SageMaker allows developers to deploy ML models primarily on edge devices and embedded systems; Google Distributed Cloud Edge extends GCP infrastructure to the edge; and NimbleEdge, an upstart company, is reimagining the infrastructure stack altogether. NimbleEdge specifically offers FL on the edge and enables companies to migrate mobile applications and websites to on-device compute. It has already brought applications and hyper-personalized ML models to millions of devices to take advantage of the benefits of FL on the edge.

While we are in the early innings, Federated Learning on the edge is here and the hyperscalers are in an incumbent's dilemma. The revenue that cloud providers earn for compute, storage, and data is at risk; modern vendors who have adopted edge computing architecture can offer customers premium ML model accuracy and reduced latency. This improves user experience and drives profitability – a value proposition that you cannot ignore for long.