NGINX Service Mesh

The rise of containers and container orchestration providers has caused a dramatic drop in the price (in terms of both dollars and time) of deploying a fleet of microservices. The added complexity of a microservices architecture has both technical and organizational implications. At its core, the service mesh serves to decouple application code from operational infrastructure. As William Morgan of Buoyant writes:

“In short, the service mesh is less a solution to a technical problem than it is a solution to a socio-technical problem.”

The “socio-technical” problem in question is the separation of ownership between business logic and operational infrastructure. As a microservices-based application expands in scale and scope, the complexity of functions such as security, monitoring, health checks, and management of east-west traffic increases as well. At a certain point, practices such as bundling routing and networking logic into an application via an embedded library become inefficient and can lead to problems such as language lock-in. Bundling logic this way also creates an overlap of responsibility between operations teams and developers focused on core business logic. In our interview with Alan Murphy of NGINX, he described some of the issues with overlapping responsibilities that faced a developer team considering adopting a service mesh:

“The downside beyond just having that functionality different in that workflow difference was that they didn’t control that library. It was actually built by another team. If they happened to introduce something different in their application that then broke that library interaction, that was only discovered during QE or E-to-E testing. They then had to go back to the team that owned that library and say, ‘We have this problem. Can you help us? Can you help us debug it? Can you tell us why it’s not working? Why is our intercommunication no longer functioning?’”

Embedded Library Approach

The “Sidecar Proxy” model is one solution to this tight-coupling issue, and it forms a critical piece of the service mesh architecture as a whole. Rather than embedding a library in the application, traffic is routed through a separate proxy. In addition to “north-south” traffic between a server and a client, microservices architectures rely on “east-west” network traffic between services. The sidecar proxy runs in the same Kubernetes pod as the service, acting as a gatekeeper for east-west network traffic from other services. Alan Murphy described the benefits of the sidecar proxy:
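In Kubernetes terms, the sidecar model amounts to running two containers in one pod. The following is a minimal sketch (the service name, image, and ports are hypothetical; in a real mesh the control plane injects the sidecar automatically rather than it being written by hand):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: orders                  # hypothetical service name
spec:
  containers:
  - name: app                   # the business-logic container
    image: example/orders:1.0   # hypothetical application image
    ports:
    - containerPort: 8080       # app listens only on localhost traffic
  - name: nginx-sidecar         # the sidecar proxy, same pod, same network namespace
    image: nginx:stable
    ports:
    - containerPort: 443        # east-west traffic from other services enters here
```

Because both containers share the pod's network namespace, the sidecar can reach the application over localhost while presenting a single policy-enforcement point to the rest of the mesh.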

“The benefit there is twofold; one is that you have a single point for policy enforcement, security enforcement, that’s the sidecar. The second benefit is that you no longer have to code any of that into the app stack or the app layer. Your application developer doesn’t need to know what an ACL is, they don’t need to worry about SSL exchange, or having the right client certificate. It takes that burden off the app and puts it back in the sidecar. That’s the real value of a sidecar and why it’s so important in east-west traffic management, specifically in a service mesh.”
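The SSL exchange and client-certificate checks Murphy mentions can live entirely in the sidecar's configuration. As a hedged illustration (certificate paths and the upstream port are hypothetical, not taken from any NGINX Service Mesh default), an NGINX sidecar terminating mutual TLS for east-west traffic might look like:

```nginx
# Hypothetical sidecar config: other services must present a valid
# client certificate; the application itself never handles TLS.
server {
    listen 443 ssl;
    ssl_certificate         /etc/certs/service.crt;   # this service's identity
    ssl_certificate_key     /etc/certs/service.key;
    ssl_client_certificate  /etc/certs/ca.crt;        # mesh certificate authority
    ssl_verify_client       on;                       # enforce mutual TLS

    location / {
        proxy_pass http://127.0.0.1:8080;  # plain HTTP to the app container
    }
}
```

The application behind `proxy_pass` sees only plain local traffic, which is exactly the burden-shifting Murphy describes.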

Sidecar Proxy Approach

Developers looking to implement the sidecar proxy approach face a tradeoff with latency: the additional network “hop” will not be quite as fast as an internal call to an embedded library. It is therefore important, when building a service mesh, to choose a proxy that is performant and lightweight.

NGINX is a web server used by over 445 million websites worldwide, ranking first in several measures of web server adoption. NGINX was developed in 2004 as open-source software with an emphasis on being lightweight, scalable, and high-performance. NGINX uses an asynchronous, event-driven architecture that optimizes performance and efficiency while handling high concurrency.
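The advantage of an event-driven architecture is that a single thread can interleave many connections while each one waits on I/O, rather than dedicating a thread per connection. NGINX implements this in C; the following is only a minimal Python sketch of the same idea, using simulated network delays:

```python
import asyncio
import time

# Sketch of event-driven concurrency (illustrative, not NGINX code):
# one thread services many slow "connections" by interleaving them.
async def handle_request(i: int) -> str:
    await asyncio.sleep(0.1)  # simulated slow network I/O
    return f"response {i}"

async def main() -> list:
    # 100 concurrent requests complete in roughly 0.1s of wall time,
    # not 10s, because the event loop runs others while each waits.
    return await asyncio.gather(*(handle_request(i) for i in range(100)))

start = time.perf_counter()
responses = asyncio.run(main())
elapsed = time.perf_counter() - start
```

A thread-per-connection design would pay memory and scheduling costs for every idle connection; the event loop pays almost nothing, which is why this style handles high concurrency efficiently.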

NGINX has also found widespread use as a load balancer, API gateway, and reverse proxy. With the rise of containerized applications, developers found that while some containers may not need a web server in the traditional sense, in many cases it made sense to add a lightweight web server to provide logging, security, and monitoring functions. NGINX, being lightweight by design, was well suited to this task: NGINX is the most commonly deployed technology in Docker containers.

The principal use case for NGINX in a service mesh architecture has been as a sidecar proxy, typically with another service such as Istio serving as the control plane. NGINX developed an open-source project called nginmesh, an implementation of NGINX as the sidecar proxy in the data plane paired with an Istio control plane. 

Since the acquisition of NGINX by F5, NGINX has developed a Service Mesh Module as part of its NGINX Controller Application Platform. NGINX Controller acts as a managed control plane for application development with a focus on microservices, and the Service Mesh Module shares a common GUI with the existing API Management Module. The most recent version, NGINX Controller 3.0, takes what F5 and NGINX call an “app-centric approach to managing and delivering apps and APIs”, which consolidates development and infrastructure functions around a single platform. Kevin Jones of NGINX described the way that NGINX Controller works:

“…the idea is that you have one API management plane that’s handling the policy and the configuration for how those applications should be managed and accessed, but then the actual routing and the load-balancing and authentication and authorization is usually handled at the API gateway layer.”

This relationship between the NGINX Controller and NGINX API gateway layers is congruent with the use of control and data planes in the service mesh, a connection Jones makes explicit:

“This is where as you start to even get more and more proxy configurations, you start building the need for something like a service mesh, and this is where your control plane really pays a lot of value because it’s going to help you with configuration. It’s going to help you with monitoring. It’s going to help you with management and give you that visibility into the entire infrastructure.”

With a wide array of options for building service mesh control and data planes, integrating up the stack using NGINX could be a logical choice for an organization that wishes to reduce the burden of configuring several different types of services, and take advantage of NGINX’s core competencies:

“From us, it’s very, very critical to have a very homogeneous control plane and data plane, because we can take advantage of so much that’s already there in NGINX. When we talk about control plane functionality, we’re talking about how does the admin interact with the mesh, how does the mesh inject sidecars, how does it manage and consume policy and push that policy down to the sidecar for consumption, that’s where we can start talking about really, really cool opportunities to have an NGINX control plane, NGINX data plane, very, very tightly coupled, very, very efficient and small, small footprint. That’s our thinking in NGINX world about whether you should use a data plane, sidecar off the shelf, or use one that you can truly understand and integrate with natively.”

-Alan Murphy, NGINX

Despite the hype, the adoption of service meshes in production has been slow. The principal concern in the development community centers around the tradeoff between the benefits of a service mesh and the complexity of its implementation.

The hesitance of organizations to adopt service meshes until a dominant market leader emerges bears similarities to the container orchestration wars. Then, as now, the appearance of daunting complexity makes investment in one stack or another a risky proposition while the competing technologies evolve and compete for a small but growing market share of early adopters. 

The uncertainty surrounding service meshes is important for understanding NGINX’s value proposition. As Alan Murphy of NGINX notes:

“It’s that we know how to build an extremely efficient reverse proxy. We know how to embed some extremely sophisticated routing mechanisms, security policies…we were designed from the ground up to do all that in software only, as a containerized appliance or containerized function. Running as a sidecar to us is just as native as running on bare metal on an 88-core Dell box in front of the data center. It’s just as native as running as an AWS AMI, just as native as running in your own VM. We have that market support, we have that market trust, but we also have the design pedigree of knowing how to write really efficient software in a tiny, tiny footprint.”

NGINX and F5 are both well-established in the space, and their reputation for quality and reliability may provide the reassurance skeptical organizations need to “get off the bench.” Given its widespread use as a web server and reverse proxy, many organizations considering a service mesh architecture likely already use NGINX in some fashion, and NGINX has fifteen years of industry adoption under its belt. The managed environment released in NGINX Controller 3.0 may also ease the transition for organizations that require a more comprehensive set of managed services built around a service mesh infrastructure.

For more on NGINX Service Meshes, check out our interview with Alan Murphy on the topic, or visit NGINX’s website to read their posts on service meshes and microservices. For more about service meshes in general, check out our past articles and podcasts on the subject.


Danny Seymour

Santa Fe, New Mexico
Education: MBA, Finance and Public Policy, University of New Mexico

Danny is a Santa Fe-based developer who works as a Junior Consultant at Rural Sourcing.

Software Weekly

Subscribe to Software Weekly, a curated weekly newsletter featuring the best and newest from the software engineering community.