Edge Computing and the Future of the Cloud
“Edge” and “fog” are the new buzzwords we keep hearing. What is edge computing, and what applications does it have? Well, to answer these questions, we have to understand how it came about. Let’s start with a brief history lesson.
Computing infrastructure has swung between centralized and decentralized architectures several times over the past decades. As a brief introduction to these architectures: centralized computing involves a central computer, as the name suggests, which multiple other machines access via terminals. Decentralized computing, on the other hand, involves multiple stand-alone computers or machines communicating with each other through various protocols.
In the 1950s, at the beginning of commercial computing, centralized computing became prominent with large, expensive mainframe computer systems. Between the mainframe era and 1997, computing went through two cycles of centralization and decentralization. By then, desktop computers built from cheap commodity hardware had become widespread in homes. This was the last decentralization step, followed by the ever-so-famous centralized cloud architecture, with huge data centers performing basically every important task. But now edge computing is at the door, knocking, for the next shift.
Cloud is the current centralized paradigm. Almost all web content is served through a major data center. Researchers rent private servers from the cloud to test their models and run their experiments. Enterprises perform their business logic in remote servers. The cloud offers a convenient way for businesses, small or big, to get computing resources from a service provider instead of building their own data center. Usage of cloud services can be seen everywhere: AWS, DigitalOcean, Azure, Google Cloud, and VMware have become names that every developer knows. But the winds are changing. What’s causing the new trend towards a higher emphasis on edge devices and edge computing? Well, what actually is the edge?
Defining Edge Computing
On Edge Deep Learning with Aran Khanna, a pretty compact definition of edge devices was given: “An edge device is essentially any device that sits outside of the data center.” Edge computing, as a result, “is a new paradigm in which substantial compute and storage resources are placed at the edge of the Internet, in close proximity to mobile devices or sensors (source)”. An edge device might be the mobile phone you are using right now. It can be the security cameras around streets and banks. They don’t have to be small, either; self-driving cars are also considered edge devices. As Peter Levine said on one of our podcast episodes, these devices make real-time, real-world information gathering easier, and they are becoming more widespread with the proliferation of IoT devices. Furthermore, many of these devices generate real-world data such as images and videos.

One of the biggest issues with the abundance of edge devices is the sheer amount of data being generated. A surveillance camera with a frame rate of 10 Hz can generate over 250MB of data per second. A plane connected to the Internet generates 5TB of data per day. A self-driving car generates 4TB of data per day. And these are just individual devices. Imagine the amount of traffic if many of these devices tried to send their raw data to a centralized server. For devices that require near real-time processing, the resulting bandwidth and latency problems make a purely cloud-based structure infeasible.
Another concern is energy consumption. David Brooks, a computer architect at Harvard SEAS, points out that transferring a single bit over the Internet consumes 500 microjoules. According to his calculations, in 2015, a monthly cellular data usage of 3.7 exabytes translated to 500 terawatt hours, meaning 2% of the world’s energy consumption was spent on cellular data transfer. Edge computing can greatly reduce this energy consumption.
An important consideration when using cloud services is privacy and security. Data breaches are becoming more common, and organizations such as hospitals that must ensure user privacy cannot send raw data directly to their cloud services. There has to be preprocessing involved at the edge level.
In comes edge computing. In this architecture, computing gets physically closer to devices, either in the form of the device performing the computation itself, or by deploying a cloudlet close to the device that acts as a miniature cloud, or a combination of both. This middle layer consisting of cloudlets is also sometimes called “fog”, and these cloudlets are sometimes called “fog nodes”. This cloud-fog-edge architecture provides numerous benefits. The four main advantages are lower latency, lower cost with edge analytics, privacy-policy enforcement, and increased reliability (source).
Edge devices and cloudlets are physically close, often only one hop apart, whereas reaching the cloud usually takes several hops. Cloudlets can even be connected to edge devices through a wired connection. This provides lower latency and higher bandwidth, since a fog node serves far fewer devices than a central cloud does. Handling the data in cloudlets therefore yields lower response times.
The data collected by edge devices is enormous, especially for high-data-rate devices. Sending all of it to the cloud for analysis and inference can hog precious bandwidth, and can be outright impossible in many cases. Preprocessing such as sampling and blanking, performed right at the edge device or at a cloudlet, can greatly reduce the amount of data being transferred, and allows structured data to be sent directly to the cloud for storage or further processing. Thus, a lower cost is achieved through reduced bandwidth and energy usage. Preprocessing on a local node can also ensure that the necessary privacy policies are enforced, such as redacting sensitive and identifiable information from hospital reports or blurring faces in camera footage.
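As a concrete illustration, here is a minimal sketch in Python with NumPy of two such preprocessing steps: keeping only every Nth frame of a video stream, and pixelating a face region before anything leaves the device. The frame sizes, bounding box, and tile size are all hypothetical, and a real deployment would use a proper detector to find faces.

```python
import numpy as np

def sample_frames(frames, keep_every=10):
    """Keep every Nth frame, e.g. dropping a 10 Hz stream down to 1 Hz."""
    return frames[::keep_every]

def pixelate_region(frame, box, k=8):
    """Pixelate a (hypothetical) face bounding box (x0, y0, x1, y1)
    by replacing each k x k tile with its mean value."""
    x0, y0, x1, y1 = box
    region = frame[y0:y1, x0:x1].astype(float)
    h, w = region.shape
    region = region[: h - h % k, : w - w % k]      # crop to a multiple of k
    tiles = region.reshape(region.shape[0] // k, k, region.shape[1] // k, k)
    means = np.repeat(np.repeat(tiles.mean(axis=(1, 3)), k, axis=0), k, axis=1)
    frame[y0:y0 + means.shape[0], x0:x0 + means.shape[1]] = means
    return frame

# 100 synthetic grayscale frames; only 10 of them ever leave the device
frames = np.random.default_rng(0).integers(0, 256, size=(100, 64, 64))
kept = sample_frames(frames, keep_every=10)
anonymized = [pixelate_region(f.copy(), box=(8, 8, 40, 40)) for f in kept]
```

Even this crude combination cuts the outgoing data tenfold and strips identifiable detail from the face region before transmission.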
Where there is a vast amount of data, there is machine learning. As such, edge devices and edge computing have a close relationship with machine learning. For example, a surveillance camera constantly generates images of the area it covers. It might use deep learning to recognize a specific object, perhaps a human or a car, and might move to keep that object under surveillance. A self-driving car needs to calculate its next action from the data its sensors and cameras produce, and all of these inputs must be run through a machine learning model for a proper inference. An interesting use case is the work of Chang et al. on machine learning for network edge caching. They claim that by accurately profiling user behavior in an area, using data from edge devices such as mobile phones and personal computers, and constructing clusters with unsupervised learning on that data, appropriate content can be proactively cached either on users’ edge devices or in a cloudlet near the area to achieve low latency and preserve energy. As IoT gets more creative and seeps deeper into our daily lives, the possibilities for machine learning applications on the edge are endless.
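The caching idea can be sketched roughly as follows. This is a toy Python/NumPy example with synthetic per-user request histograms and a tiny hand-rolled k-means; the actual system in the paper is considerably more sophisticated, so treat this only as an illustration of "cluster users, then pre-cache each cluster's favorite content nearby."

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-user request histograms over 5 content categories:
# one group mostly requests category 0, the other mostly category 4
users = np.vstack([
    rng.dirichlet([8, 1, 1, 1, 1], size=20),
    rng.dirichlet([1, 1, 1, 1, 8], size=20),
])

def kmeans(X, k=2, iters=20):
    # Farthest-point initialization, then standard Lloyd iterations
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(axis=-1), axis=1)
        centers = np.array([X[labels == i].mean(axis=0) if (labels == i).any()
                            else centers[i] for i in range(k)])
    return labels, centers

labels, centers = kmeans(users)
# Proactively cache each cluster's favorite category at the nearby cloudlet
to_cache = centers.argmax(axis=1)
```

In this toy setup, the two clusters recover the two user groups, so the cloudlet would pre-cache categories 0 and 4 for the respective neighborhoods.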
Complications Between Edge Computing and Machine Learning
Edge devices, compared to cloud servers or cloudlets, have much smaller memory and less computing power. Yet most of these devices have to make near real-time decisions based on the input they gather. It is not possible for most of them to hold the data they generate, let alone use it to build a machine learning model. In past years, most applications handled inference by offloading the data to the cloud, running the input through the model there, and sending a response back to the edge device. However, this is not an ideal solution for applications requiring near real-time responses.
In such cases, there are several options with edge computing:
The first option is bringing the model to the edge devices. Mobile devices such as smartphones have enough computing power and memory to perform inference if the model is trained on a remote machine and deployed onto the edge device. In such cases, collected data can be run through the deployed model, and near real-time inference can be achieved. The data can then be sent to the cloud and, through transfer learning, be integrated into the existing model to improve its accuracy. Sounds all good, right? Not quite.
These models can be quite large, especially deep learning models, which typically carry thousands to millions of parameters and filter weights. This size can make it impossible for a device to hold the trained model, or it can hinder the device’s capacity to generate and hold its own data. However, there are a few tricks that can be performed. A model trained on the cloud that takes up more space than can be allocated on the device can be slimmed down using quantization. What quantization basically does is reduce the precision of the weights in a machine learning model, from 32-bit floating point down to 16-bit or 8-bit. This not only slims the model down, but also allows for faster inference without creating separate pipelines and separate models for edge devices. A small downside of this approach is slightly reduced accuracy, but in applications where faster inference matters more than perfect accuracy, it can be very beneficial.
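A minimal sketch of the idea, in Python with NumPy, might look like the following: symmetric linear quantization of a float32 weight tensor down to int8 plus a single scale factor. Production frameworks use more elaborate schemes (per-channel scales, calibration, quantization-aware training), so this is just the core arithmetic.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization: float32 weights -> int8 plus one scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Storage drops 4x, and the worst-case rounding error is half a quantization step
assert q.nbytes * 4 == w.nbytes
assert np.abs(w - w_hat).max() <= scale / 2 + 1e-6
```

The slight accuracy loss mentioned above comes exactly from that rounding error: every weight moves by at most half a quantization step, which is usually small relative to the weight distribution.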
Another approach to faster on-device inference is limiting the model’s capabilities to the specific subdomain that serves the task the edge device should perform. A perfect example of this is Nikouei et al.’s Lightweight CNN (L-CNN). Their aim is to put more intelligence on the edge device for faster and better inference; in this case, the edge device is a surveillance camera that must detect humans. I will not go into the details of L-CNN, but by specializing established detection methods for human detection and pruning unnecessary filters, it manages high performance without sacrificing much accuracy, making more efficient use of the device’s resources. So, with certain constraints in mind, machine learning can be integrated into edge computing for faster and lower-cost inference.
L-CNN was developed for smarter edge devices such as a Raspberry Pi, with close to 1GB of memory. This assumption is acceptable, since dealing with image data requires heavy resources. However, it is possible to develop algorithms for edge devices with far scarcer resources, as Kumar et al.’s Bonsai tree algorithm for resource-efficient machine learning shows. Bonsai suits much smaller devices that usually act as sensors in medical, industrial, or agricultural applications, with a model size under 2KB across different applications and data sets, achieved by a “novel model based on a single, shallow, sparse tree learned in a low-dimensional space”. Deploying models on the edge device, instead of getting inferences from the cloud, also has the added advantage of offline functionality.
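To get a feel for how a tree-based model can fit in a couple of kilobytes, here is a loose Python/NumPy sketch on synthetic data. This is not the actual Bonsai algorithm; it only illustrates the core idea of projecting inputs into a low-dimensional space and classifying with a single shallow tree (a depth-1 stump here, where Bonsai learns both the projection and the tree jointly).

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic sensor data: 200 samples, 100 raw features, binary label
X = rng.normal(size=(200, 100))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Project into a low-dimensional space (Bonsai learns this; we use random)
P = (rng.normal(size=(100, 5)) / np.sqrt(100)).astype(np.float32)
Z = X @ P

def fit_stump(Z, y):
    """Exhaustively pick the best single split over the projected features."""
    best = (0, 0.0, 0.5)                      # (feature, threshold, accuracy)
    for j in range(Z.shape[1]):
        for t in np.quantile(Z[:, j], np.linspace(0.1, 0.9, 9)):
            pred = (Z[:, j] > t).astype(int)
            acc = max((pred == y).mean(), (pred != y).mean())
            if acc > best[2]:
                best = (j, t, acc)
    return best

feature, threshold, acc = fit_stump(Z, y)
# The whole deployed "model" is P plus one split: 2KB of float32 parameters
```

Here the projection matrix is 100 × 5 float32 values, i.e. 2000 bytes, plus a single feature index and threshold; everything an inference call needs fits comfortably in a sensor-class device's memory.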
Future Challenges for Edge Computing
A quintessential benefit of the cloud era is centralization and the ease of management it brings. As computing becomes distributed across many nodes and devices, management problems arise. Centralized systems are also easier to secure: as the number of devices in a network increases, it gets harder to ensure security across the network. Physical tampering with the devices becomes possible, as does intercepting the data transfer between a device and a cloudlet, or between a cloudlet and the central cloud. These challenges are already being tackled, so the path in front of edge computing seems to be getting clearer.
In closing, the cloud will certainly not disappear. We still need data centers for many other use cases that require heavy computation, and for storage. However, until the next big shift, the edge is the future.