Shaping Chick-fil-A One Traffic in a Multi-Region Active-Active Architecture

This post was originally written by Christopher Lane, Jamey Hammock, & Brian Chambers on Medium. Reposted with permission.


In this post, we will share the architecture behind north-south routing and traffic shaping in the API tier of our Chick-fil-A One mobile application.

Scaling Chick-fil-A One

Chick-fil-A’s mobile application continues to be one of the industry’s favorites. In addition to regular mobile orders, our traffic is often driven by national promotions or product launches that result in huge traffic spikes.

During our most recent campaign, we saw upwards of 2,000 transactions per second flowing through our API tier.

[Figure: Traffic to Chick-fil-A One during a national promotion]

It is very important to us, and to our customers, to maintain a high level of availability for the numerous microservices that support the Chick-fil-A One application, so we have taken steps toward a multi-region active-active architecture. On our microservices journey, we are also shifting toward containers, with many services deployed in Kubernetes clusters on Amazon EKS.

At AWS re:Invent 2018, we shared the approaches we have taken so far, and will take in the future, to deploy these multi-region, active-active APIs in AWS.

Ingress Control Architecture

Our team is using the Kubernetes ALB Ingress Controller (developed by AWS and Ticketmaster) in combination with Ambassador (an open source technology from Datawire) for north-south routing into our Amazon EKS clusters.

The ALB Ingress Controller is a Kubernetes ingress controller that manages AWS Application Load Balancers (ALB). Ambassador is an open source, Kubernetes-native API Gateway built on the Envoy proxy developed at Lyft.

There is some overlap between the two projects, but we wanted the best of both worlds. The ALBs provide benefits like the AWS Web Application Firewall (WAF) and SSL termination. Envoy provides service routing, authentication, canary deployments, rate limiting, and transparent monitoring of L7 traffic to our services.

Ambassador’s role is to provide a simple, decentralized way to manage Envoy proxies via annotations on Kubernetes service manifest files. This allows service developers to declare the entire state of the routing tier in code and enables tight integration with our DevOps pipelines.

We are pretty excited about connecting these two projects!

The Best of Both Worlds

We reached out to Datawire to discuss our approach. They suggested we make Ambassador a Kubernetes service of type ClusterIP and have the ALB route traffic to Ambassador. Assume two services, Service A and Service B, are running in a cluster on Amazon EKS. Requests to /a/* go through the ALB to the Ambassador service, which then selects the Service A pods running on our EKS nodes. Requests to /b/* are routed in a similar manner.

We use Amazon Route53 to do region health checking and geography-based prioritization when routing traffic to our service regions. Region affinity is based on a user’s token, which is obtained out-of-band at login earlier in the flow, and applies for the lifetime of their session.

We’re also using ExternalDNS to configure Amazon Route53, CoreDNS for DNS and service discovery, fluentd-cloudwatch to ship logs to CloudWatch, and Prometheus Operator for monitoring time-series metrics.
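
To make that concrete, here is a hypothetical sketch of the kind of ExternalDNS annotations involved, added to the per-region Ingress we define later in this post. ExternalDNS reads these and creates a latency-based Route53 record for the host, tied to a health check (latency-based routing stands in here for the geo-prioritization described above). The hostname, set identifier, and health check ID are placeholders, and the exact annotation names depend on your ExternalDNS version:

metadata:
  annotations:
    # Hypothetical values; ExternalDNS creates one Route53 record per region,
    # distinguished by the set-identifier and tied to a Route53 health check.
    external-dns.alpha.kubernetes.io/hostname: api.example.com
    external-dns.alpha.kubernetes.io/set-identifier: us-east-1
    external-dns.alpha.kubernetes.io/aws-region: us-east-1
    external-dns.alpha.kubernetes.io/aws-health-check-id: <HEALTH CHECK ID>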

We manage Kubernetes deployments with Vessel, a homegrown GitOps tool that we’ll be open sourcing shortly.

Deploying the Solution

Here are the steps we took to make the integration between the ALB Ingress Controller and Ambassador happen.

The first step in deploying this type of solution is to deploy the ALB Ingress Controller:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alb-ingress-controller
  labels:
    app: alb-ingress-controller
rules:
  - apiGroups:
      - ""
      - extensions
    resources:
      - configmaps
      - endpoints
      - events
      - ingresses
      - ingresses/status
      - services
    verbs:
      - create
      - get
      - list
      - update
      - watch
      - patch
  - apiGroups:
      - ""
      - extensions
    resources:
      - nodes
      - pods
      - secrets
      - services
      - namespaces
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alb-ingress-controller
  labels:
    app: alb-ingress-controller
roleRef:
  name: alb-ingress-controller
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
subjects:
- kind: ServiceAccount
  name: alb-ingress
  namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alb-ingress
  namespace: default
  labels:
    app: alb-ingress-controller
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alb-ingress-controller
  namespace: default
  labels:
    app: alb-ingress-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alb-ingress-controller
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: alb-ingress-controller
    spec:
      containers:
        - args:
            - /server
            - --watch-namespace=default
            - --ingress-class=alb
            - --cluster-name=<CLUSTER NAME>
          env:
            - name: AWS_REGION
              value: <AWS REGION>
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: quay.io/coreos/alb-ingress-controller:1.0
          imagePullPolicy: Always
          name: server
          resources: {}
          terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
      serviceAccountName: alb-ingress
      serviceAccount: alb-ingress
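
With the cluster name and region filled in, apply the manifest and confirm the controller pod is running (the file name here is just illustrative):

kubectl apply -f alb-ingress-controller.yaml
kubectl get pods -l app=alb-ingress-controller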

Next, deploy Ambassador following the directions in the Ambassador Getting Started guide (we use the RBAC manifest, since RBAC is enabled in our cluster):

kubectl apply -f https://getambassador.io/yaml/ambassador/ambassador-rbac.yaml

Then, deploy the Ambassador ClusterIP service:

apiVersion: v1
kind: Service
metadata:
  name: ambassador
  namespace: default
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind:  Module
      name:  ambassador
      config:
        diagnostics:
          enabled: false
spec:
  type: ClusterIP
  ports:
  - port: 80
    protocol: TCP
    name: http
  selector:
    service: ambassador
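
Before wiring up the ALB, it is worth confirming that the service selects the Ambassador pods (the label selector below matches the service: ambassador label used in the manifest above):

kubectl get service ambassador
kubectl get pods -l service=ambassador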

Finally, route all traffic from the ALB to the Ambassador service. Because the target-type annotation is set to ip, the ALB sends traffic directly to the Ambassador pod IPs, which is why a plain ClusterIP service is sufficient:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ambassador
  namespace: default
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80,"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:<REGION>:xxxxx:certificate/xxxxxxx
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/subnets: subnet-xxxx,subnet-xxxx,subnet-xxxx
spec:
  rules:
    - host: <HOSTNAME>
      http:
        paths:
          - path: /*
            backend:
              serviceName: ambassador
              servicePort: 80
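
Once this Ingress is applied, the controller provisions an ALB, and its DNS name appears in the Ingress status (the ADDRESS column):

kubectl get ingress ambassador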

We are now ready to create our service mappings in Ambassador! For example, the Ambassador annotations for Service A in our above architecture might look like:

apiVersion: v1
kind: Service
metadata:
  name: service-a
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: service_a_mapping
      prefix: /a/
      service: service-a:80
spec:
  selector:
    app: service-a  # assumed label on the Service A pods
  ports:
  - name: http
    port: 80
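
Because mappings are just annotations, the canary deployments mentioned earlier fall out naturally: a second Mapping with the same prefix and a weight shifts a slice of traffic to a canary. A hypothetical sketch, with the names and the 10% weight chosen purely for illustration:

apiVersion: v1
kind: Service
metadata:
  name: service-a-canary
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: service_a_canary_mapping
      prefix: /a/
      # Send roughly 10% of /a/ traffic to the canary; the remainder
      # continues to flow through service_a_mapping above.
      weight: 10
      service: service-a-canary:80
spec:
  selector:
    app: service-a-canary  # assumed pod label for the canary deployment
  ports:
  - name: http
    port: 80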
