How to Capture Packets that Don't Exist

One of my favorite networking tools is Wireshark; it shows you a packet-by-packet view of what’s going on in your network. Wireshark’s packet capture view is the lowest level and most extensive you can get before you have to bust out the oscilloscope. This practice is well-established in the pre-Kubernetes world, but it has some challenges if you’re moving to a Cloud Native environment. If you are using or moving to Cloud Native, you’re going to want to use packet-level tools and techniques in any environment, so that’s why we built Aspen Mesh Packet Inspector. It’s designed to address these challenges across various environments, so you can more easily see what’s going on in your network without the complexity.  

Let me explain the challenges facing our users moving into a Kubernetes world. It’s important to note that there are two parts to a troubleshooting session based on packet capture: actually capturing the packets, and then loading them into your favorite tool. Aspen Mesh Packet Inspector enables users to capture packets even in Kubernetes. That’s the part you need to address to power all the existing tools you probably already have. Leveraging the existing tools is important as our customers have invested heavily in them. And not just monetarily – their reputation for reliable apps, services and networks depends on the reliability and usefulness of the tools, procedures and experience powered by a packet view.

What’s so hard about capturing these packets in modern app architectures on Kubernetes? The two biggest challenges are that these packets may never be actual packets that go through a switch, and even if they were, they’d be encrypted and useless. 

Outside of the Kubernetes world, there are many different approaches to capture packets. You can capture packets right on your PC to debug a local issue. For serious network debugging, you’re usually capturing packets directly on networking hardware, like a monitor port on a switch, or dedicated packet taps or brokers. But in Kubernetes, some traffic will never hit a dedicated switch or tap. Kubernetes is used to schedule multiple containers onto the same physical or virtual machine. If one container wants to talk to another container that happens to be on the same machine, then the packets exchanged between them are virtual – they're just bytes in RAM that the operating system shuffles between containers. 

There’s no guarantee that the two containers that you care about will be scheduled onto the same machine, and there’s no guarantee that they won’t be. In fact, if you know two containers are going to want to talk to each other a lot, it’s a good idea to encourage scheduling on the same node for performance: these virtual packets don’t consume any capacity on your switch and advanced techniques can accelerate container-to-container traffic inside a machine.

Customers that stake their reputation on reliability don’t like mixing “critical tool” and “no guarantee”. They need to capture traffic right at the edge of the container. That’s what Aspen Mesh Packet Inspector does. It’s built into Carrier-Grade Aspen Mesh, a service mesh purpose-built for these critical applications.

There’s still a problem though – if you are building apps on Kubernetes, you should be encrypting traffic between pods. It’s a best practice that is also required by various standards including those behind 5G. In the past, capture tools have relied on access to the encryption key to show the decrypted info. Newer encryption like TLS 1.3 has a feature called “forward secrecy” that impedes this. Forward secrecy means every connection is protected with its own temporary key that was securely created by the client and the server – if your tool wasn’t in the middle when this key was generated, it’s too late. Access to the server’s encryption key later won’t work.

One approach is to force a broker or tap into the middle for all connections. But that means you need a powerful (i.e. expensive) broker, and it’s a single-point-of-failure. Worse, it’s a security single-point-of-failure: everything in the network has to trust it to get in the middle of all conversations. 

Our users already have something better suited – an Aspen Mesh sidecar (built on Envoy). They’re already using a sidecar next to each container to offload encryption using strong techniques like mutual TLS with forward secrecy. Each sidecar has only one security identity for the particular app container it is protecting, so sidecars can safely authenticate each other without any trusted-box-in-the-middle games. 

That’s the second key part of Aspen Mesh Packet Inspector – because Aspen Mesh is where the plaintext-to-encrypted operation happens (right before leaving the Kubernetes pod), we can record the plaintext. We capture the plaintext and slice it into virtual packets (in a standard “pcap” format). When we feed it to a capture system like a packet broker, we use mutual TLS to protect the captured data.  Our users combine this with a secure packet broker, and get to see the plaintext that was safely and securely transported all the way from the container edge to their screen. 
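As a rough sketch of what "slice it into virtual packets in pcap format" can look like, here is a minimal writer for the classic pcap file layout. It is not Aspen Mesh's implementation: the function names, the 1400-byte slice size and the choice of the private-use link type LINKTYPE_USER0 (147) are assumptions made for illustration. A real capture pipeline would more likely synthesize protocol headers so that standard dissectors can decode the payload.

```python
import struct
import time

# Assumption: a private-use link type, so the payload is stored as opaque bytes.
LINKTYPE_USER0 = 147


def slice_into_packets(plaintext: bytes, max_payload: int = 1400) -> list:
    """Split a captured byte stream into packet-sized chunks."""
    return [plaintext[i:i + max_payload] for i in range(0, len(plaintext), max_payload)]


def to_pcap(chunks, linktype: int = LINKTYPE_USER0, snaplen: int = 65535) -> bytes:
    """Serialize chunks as a classic little-endian pcap file."""
    # Global header: magic, version 2.4, timezone offset, sigfigs, snaplen, link type.
    out = [struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, snaplen, linktype)]
    now = time.time()
    for chunk in chunks:
        ts_sec = int(now)
        ts_usec = int((now - ts_sec) * 1_000_000)
        # Per-record header: timestamp, captured length, original length.
        out.append(struct.pack("<IIII", ts_sec, ts_usec, len(chunk), len(chunk)))
        out.append(chunk)
    return b"".join(out)


# Hypothetical plaintext recorded at the sidecar before encryption.
plaintext = b"GET /orders HTTP/1.1\r\nHost: orders.example\r\n\r\n" * 100
chunks = slice_into_packets(plaintext)
pcap_bytes = to_pcap(chunks)
print(len(chunks), len(pcap_bytes))
```

The resulting bytes can be written to a file and opened in a capture tool, although with this link type the payload shows up as raw data rather than decoded HTTP.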

If you’re a service provider operating Kubernetes at scale, packet tapping capabilities are critical for you to be able to operate the networks effectively, securely and within regulatory and compliance standards. Aspen Mesh Packet Inspector provides the missing link in Kubernetes, providing full packet visibility for troubleshooting and meeting lawful intercept requirements.  


Doubling Down On Istio

Good startups believe deeply that something is true about the future, and organize around it.

When we founded Aspen Mesh as a startup inside of F5, my co-founders and I believed these things about the future:

  1. App developers would accelerate their pace of innovation by modularizing and building APIs between modules packaged in containers.
  2. Kubernetes APIs would become the lingua franca for describing app and infrastructure deployments and Kubernetes would be the best platform for those APIs.
  3. The most important requirement for accelerating is to preserve control without hindering modularity, and that’s best accomplished as close to the app as possible.

We built Aspen Mesh to address item 3. If you boil down reams of pitch decks, board-of-directors updates, marketing and design docs dating back to summer of 2017, that's it. That's what we believe, and I still think we're right.

Aspen Mesh is a service mesh company, and the lowest levels of our product are the open-source service mesh Istio. Istio has plenty of fans and detractors; there are plenty of legitimate gripes and more than a fair share of uncertainty and doubt (as is the case with most emerging technologies). With that in mind, I want to share why we selected Istio and Envoy for Aspen Mesh, and why we believe more strongly than ever that they're the best foundation to build on.

 

Why a service mesh at all?

A service mesh is about connecting microservices. The acceleration we're talking about relies on applications that are built out of small units (predominantly containers) that can be developed and owned by a single team. Stitching these units into an overall application requires APIs between them. APIs are the contract. Service Mesh measures and assists contract compliance. 

There's more to it than reading the 12-factor app. All these microservices have to effectively communicate to actually solve a user's problem. Communication over HTTP APIs is well supported in every language and environment so it has never been easier to get started.  However, don't let the simplicity delude: you are now building a distributed system. 

We don't believe the right approach is to demand deep networking and infrastructure expertise from everyone who wants to write a line of code.  You trade away the acceleration enabled by containers for an endless stream of low-level networking challenges (as much as we love that stuff, our users do not). Instead, you should preserve control by packaging all that expertise into a technology that lives as close to the application as possible. For Kubernetes-based applications, this is a common communication enhancement layer called a service mesh.

How close can you get? Today, we see users having the most success with Istio's sidecar container model. We forecasted that in 2017, but we believe the concept ("common enhancement near the app") will outlive the technical details.

This common layer should observe all the communication the app is making; it should secure that communication and it should handle the burdens of discovery, routing, version translation and general interoperability. The service mesh simplifies and creates uniformity: there's one metric for "HTTP 200 OK rate", and it's measured, normalized and stored the same way for every app. Your app teams don't have to write that code over and over again, and they don't have to become experts in retry storms or circuit breakers. Your app teams are unburdened of infrastructure concerns so they can focus on the business problem that needs solving.  This is true whether they write their apps in Ruby, Python, node.js, Go, Java or anything else.

That's what a service mesh is: a communication enhancement layer that lives as close to your microservice as possible, providing a common approach to controlling communication over APIs.

 

Why Istio?

Just because you need a service mesh to secure and connect your microservices doesn't mean Envoy and Istio are the only choice.  There are many options in the market when it comes to service mesh, and the market still seems to be expanding rather than contracting. Even with all the choices out there, we still think Istio and Envoy are the best choice.  Here's why.

We launched Aspen Mesh after learning some lessons with a precursor product. We took what we learned, re-evaluated some of our assumptions and reconsidered the biggest problems development teams using containers were facing. It was clear that users didn't have a handle on managing the traffic between microservices. We also saw that few teams were using microservices in earnest yet, so we realized this problem would get more urgent as microservices adoption increased.

So, in 2017 we asked: what would characterize the technology that solves that problem?

We compared our own nascent work with other purpose-built meshes like Linkerd (in the 1.0 Scala-based implementation days) and Istio, and non-mesh proxies like NGINX and HAProxy. This was long before service mesh options like Consul, Maesh, Kuma and OSM existed. Here's what we thought was important:

  • Kubernetes First: Kubernetes is the best place to position a service mesh close to your microservice. The architecture should support VMs, but it should serve Kubernetes first.
  • Sidecar "bookend" Proxy First: To truly offload responsibility to the mesh, you need a datapath element as close as possible to the client and server.
  • Kubernetes-style APIs are Key: Configuration APIs are a key cost for users.  Human engineering time is expensive. Organizations are judicious about what APIs they ask their teams to learn. We believe Kubernetes API design and mechanics got it right. If your mesh is deployed in Kubernetes, your API needs to look and feel like Kubernetes.
  • Open Source Fundamentals: Customers will want to know that they are putting sustainable and durable technology at the core of their architecture. They don't want a technical dead-end. A vibrant open source community ensures this via public roadmaps, collaboration, public security audits and source code transparency.
  • Latency and Efficiency: These are performance keys that are more important than total throughput for modern applications.

As I look back at our documented thoughts, I see other concerns, too (p99 latency in languages with dynamic memory management, layer 7 programmability). But the above were the key items that we were willing to bet on. So it became clear that we had to place our bet on Istio and Envoy.

Today, most of that list seems obvious. But in 2017, Kubernetes hadn’t quite won. We were still supporting customers on Mesos and Docker Datacenter. The need for service mesh as a technology pattern was becoming more obvious, but back then Istio was novel - not mainstream. 

I'm feeling very good about our bets on Istio and Envoy. There have been growing pains to be sure. When I survey the state of these projects now, I see mature, but not stagnant, open source communities. There's a plethora of service mesh choices, so the pattern is established. Moreover, the continued prevalence of Istio, even with so many other choices, convinces me that we got that part right.

 

But what about...?

While Istio and Envoy are a great fit for all those bullets, there are certainly additional considerations. As with most concerns in a nascent market, some are legitimate and some are merely noise. I'd like to address some of the most common that I hear from conversations with users.

"I hear the control plane is too complex" - We hear this one often. It’s largely a remnant of past versions of Istio that have been re-architected to provide something much simpler, but there's always more to do. We're always trying to simplify. The two major public steps that Istio has taken to remedy this include removing standalone Mixer, and co-locating several control plane functions into a single container named istiod.

However, there's some stuff going on behind the curtains that doesn't get enough attention. Kubernetes makes it easy to deploy multiple containers. Personally, I suspect the root of this complaint wasn't so much "there are four running containers when I install" but "Every time I upgrade or configure this thing, I have to know way too many details." And that is fixed by attention to quality and user-focus. Istio has made enormous strides in this area.

"Too many CRDs" - We've never had an actual user of ours take issue with the CRD count (the number of custom API object types it's possible to define). However, it's great to minimize the number of API objects you may have to touch to get your application running. To borrow a paraphrase of Einstein, we want to make it as simple as possible, but no simpler. The reality: Istio drastically reduced the CRD count with new telemetry integration models (from "dozens" down to 23, with only a handful involved in routine app policies). And Aspen Mesh offers a take on making it even simpler with features like SecureIngress that map CRDs to personas - each persona only needs to touch 1 custom resource to expose an app via the service mesh.

"Envoy is a resource hog" - Performance measurement is a delicate art. The first thing to check is that wherever you're getting your info from has properly configured the system-under-measurement. Istio provides careful advice and their own measurements here. Expect latency additions in the single-digit-millisecond range, knowing that you can opt out the parts of your application that can't tolerate even that. Also remember that Envoy is doing work, so some CPU and memory consumption should be considered a shift or offload rather than an addition. The most recent versions of Istio do not have significantly more overhead than other service meshes, but Istio does provide twice as many features, while also being available in or integrating with many more tools and products in the market.

"Istio is only for really complicated apps" - Sure. Don’t use Istio if you are only concerned with a single cluster and want to offload one thing to the service mesh. People move to Kubernetes specifically because they want to run several different things. If you've got a Money-Making-Monolith, it makes sense to leave it right where it is in a lot of cases. There are also situations where ingress or an API gateway is all you need. But if you've got multiple apps, multiple clusters or multiple app teams then Kubernetes is a great fit, and so is a service mesh, especially as you start to run things at greater scale.

In scenarios where you need a service mesh, it makes sense to use the service mesh that gives you a full suite of features. A nice thing about Istio is you can consume it piecemeal - it does not have to be implemented all at once. So you only need mTLS and tracing now? Perfect. You can add mTLS and tracing now and have the option to add metrics, canary, traffic shifting, ingress, RBAC, etc. when you need it.

We’re excited to be on the Istio journey and look forward to continuing to work with the open source community and project to continue advancing service mesh adoption and use cases. If you have any particular question I didn’t cover, feel free to reach out to me at @notthatjenkins. And I'm always happy to chat about the best way to get started on or continue with service mesh implementation. 


When Do You Need A Service Mesh?

One of the questions I often hear is: "Do I really need a service mesh?" The honest answer is "It depends." Like nearly everything in the technology space (or more broadly "nearly everything"), this depends on the benefits and costs. But after having helped users progress from exploration to production deployments in many different scenarios, I'm here to share my perspective on which inputs to include in your decision-making process.

A service mesh provides a consistent way to connect, secure and observe microservices. Most service meshes are tightly integrated with an orchestration platform, commonly Kubernetes. There's no way around it; a service mesh is another thing, and at least part of your team will have to learn it. That's a cost, and you should compare that cost to the benefits of operational simplification you may achieve.

But apart from costs and benefits, what should you be asking in order to determine if you really need a service mesh? The number of microservices you’re running, as well as urgency and timing, can have an impact on your needs.

How Many Microservices?

If you're deploying your first or second microservice, I think it is just fine to not have a service mesh. You should, instead, focus on learning Kubernetes and factoring stateless containers out of your applications first. You will naturally build familiarity with the problems that a service mesh can solve, and that will make you much better prepared to plan your service mesh journey when the time comes.

If you have an existing application architecture that provides the observability, security and resilience that you need, then you are already in a good place. For you, the question becomes when to add a service mesh. We usually see organizations notice the toil associated with utility code to integrate each new microservice. Once that toil gets painful enough, they evaluate how they could make that integration more efficient. We advocate using a service mesh to reduce this toil.

The exact point at which service mesh benefits clearly outweigh costs varies from organization to organization. In my experience, teams often realize they need a consistent approach once they have five or six microservices. However, many users push to a dozen or more microservices before they notice the increasing cost of utility code and the increasing complexity of slight differences across their applications. And, of course, some organizations continue scaling and never choose a service mesh at all, investing in application libraries and tooling instead. On the other hand, we also work with early birds that want to get ahead of the rising complexity wave and introduce service mesh before they've got half-a-dozen microservices. But the number of microservices you have isn’t the only part to consider. You’ll also want to consider urgency and timing. 

Urgency and Timing

Another part of the answer to “When do I need a service mesh?” is timing. The urgency of considering a service mesh depends on your organization’s challenges and goals, but it is also shaped by your current processes and state of operations. Here are some states that may reduce or eliminate your urgency to use a service mesh:

  1. Your microservices are all written in one language ("monoglot") by developers in your organization, building from a common framework.
  2. Your organization dedicates engineers to building and maintaining org-specific tooling and instrumentation that's automatically built into every new microservice.
  3. You have a partially or totally monolithic architecture where application logic is built into one or two containers instead of several.
  4. You release or upgrade all-at-once after a manual integration process.
  5. You use application protocols that are not served by existing service meshes (so usually not HTTP, HTTP/2, gRPC).

On the other hand, here are some signals that you will need a service mesh and may want to start evaluating or adopting early:

  1. You have microservices written in many different languages that may not follow a common architectural pattern or framework (or you're in the middle of a language/framework migration).
  2. You're integrating third-party code or interoperating with teams that are a bit more distant (for example, across a partnership or M&A boundary) and you want a common foundation to build on.
  3. Your organization keeps "re-solving" problems, especially in the utility code (my favorite example: certificate rotation, while important, is no scrum team's favorite story in the backlog).
  4. You have robust security, compliance or auditability requirements that span services.
  5. Your teams spend more time localizing or understanding a problem than fixing it.

I consider this last point the three-alarm-fire signal that you need a service mesh, and it's a good way to return to the quest for simplification. When an application is failing to deliver a quality experience to its users, how does your team resolve it? We work with organizations that report that finding the problem is often the hardest and most expensive part.

What Next?

Once you've localized the problem, can you alleviate or resolve it? It's a painful situation if the only fix is to develop new code or rebuild containers under pressure. That's where you see the benefit from keeping resiliency capabilities independent of the business logic (like in a service mesh).

If this story is familiar to you, you may need a service mesh right now. If you're getting by with your existing approach, that’s great. Just keep in mind the costs and benefits of what you’re working with, and keep asking:

  1. Is what you have right now really enough, or are you spending too much time trying to find problems instead of developing and providing value for your customers?
  2. Are your operations working well with the number of microservices you have, or is it time to simplify?
  3. Do you have critical problems that a service mesh would address?

Keeping tabs on the answers to these questions will help you determine if — and when — you really need a service mesh.

In the meantime if you're interested in learning more about service mesh, check out The Complete Guide to Service Mesh.


Protocol Sniffing Service Mesh

Protocol Sniffing in Production

Istio 1.3 introduced a new capability to automatically sniff the protocol used when two containers communicate. This makes it much easier to get started with Istio, but it has some tradeoffs. Aspen Mesh recommends that production deployments of Aspen Mesh (built on Istio) do not use protocol sniffing, and Aspen Mesh 1.3.3-am2 turns off protocol sniffing by default. This blog explains the tradeoffs and why we think turning off protocol sniffing is the better choice.

What Protocol Sniffing Is

Protocol sniffing predates Istio. For our purposes, we're going to define it as examining some communication stream and classifying it as implementing one protocol (like HTTP) or another (like SSH), without additional information. For example, here are two streams from client to server. If you've ever debugged these protocols, you won't have a hard time telling them apart:

[Figure: two example client-to-server streams, one HTTP and one SSH]
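As a rough illustration of how different such streams look on the wire, the sketch below classifies the first bytes a client sends. The sample bytes and the `classify` function are hypothetical. They follow the general shape of an HTTP request line and an SSH identification string, and are not taken from the original figure or from Envoy's implementation.

```python
HTTP_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE", b"OPTIONS", b"PATCH")


def classify(first_bytes: bytes) -> str:
    """Guess the protocol from the first bytes a client sends."""
    # SSH clients open with an identification string such as "SSH-2.0-...".
    if first_bytes.startswith(b"SSH-"):
        return "ssh"
    # HTTP/1.x clients open with "<METHOD> <target> HTTP/<version>".
    first_line = first_bytes.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ")
    if len(parts) == 3 and parts[0] in HTTP_METHODS and parts[2].startswith(b"HTTP/"):
        return "http"
    return "unknown"


# Hypothetical sample streams.
http_stream = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
ssh_stream = b"SSH-2.0-OpenSSH_8.9\r\n"

print(classify(http_stream))  # http
print(classify(ssh_stream))   # ssh
```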

In an active service mesh, the Envoy sidecars will be handling thousands of these streams a second.  The sidecar is a proxy, so it reads every byte in the stream from one side, examines it, applies policy to it and then sends it on.  In order to apply proper policy ("Send all PUTs to /create_* to create-handler.foo.svc.cluster.local"), Envoy needs to understand the bytes it is reading.  Without protocol sniffing, that's done by configuring Envoy:

  • Layer 7 (HTTP): "All streams with a destination port of 8000 are going to follow the HTTP protocol"
  • Layer 4 (SSH): "All streams with a destination port of 22 are going to follow the SSH protocol"

When Envoy sees a stream with destination port 8000, it reads each byte and runs its own HTTP protocol implementation to understand those bytes and then apply policy.  Port 22 has SSH traffic; Envoy doesn't have an SSH protocol implementation so Envoy treats it as opaque TCP traffic. In proxies this is often called "Layer 4 mode" or "TCP mode"; this is when the proxy doesn't understand the higher-level protocol inside, so it can only apply a simpler subset of policy or collect a subset of telemetry.

For instance, Envoy can tell you how many bytes went over the SSH stream, but it can't tell you anything about whether those bytes indicated a successful SSH session or not.  But since Envoy can understand HTTP, it can say "90% of HTTP requests are successful and get a 200 OK response".
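To make the difference concrete, here is a toy comparison of what a proxy can report in each mode. The function names and sample responses are made up for illustration, and the parsing is far simpler than a real HTTP implementation.

```python
def l4_telemetry(stream: bytes) -> dict:
    """Layer 4 view: the proxy only knows how many bytes passed through."""
    return {"bytes": len(stream)}


def l7_telemetry(responses: list) -> dict:
    """Layer 7 view: parse each HTTP status line and compute a success rate."""
    ok = 0
    for raw in responses:
        status_line = raw.split(b"\r\n", 1)[0]       # e.g. b"HTTP/1.1 200 OK"
        status_code = int(status_line.split(b" ")[1])
        if status_code == 200:
            ok += 1
    return {"requests": len(responses), "ok_rate": ok / len(responses)}


# Nine successful responses and one failure (hypothetical traffic).
responses = [b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"] * 9
responses.append(b"HTTP/1.1 503 Service Unavailable\r\nContent-Length: 0\r\n\r\n")

print(l4_telemetry(b"".join(responses)))
print(l7_telemetry(responses))
```

The same bytes flow through in both cases. Only the Layer 7 view can say that 90% of the requests succeeded.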

Here's an analogy - I speak English but not Italian; however, I can read and write the Latin alphabet that covers both.  So I could copy an Italian message from one piece of paper to another without understanding what's inside. Suppose I was your proxy and you said, "Andrew, copy all mail correspondence into email for me" - I could do that whether you received letters from English-speaking friends or Italian-speaking ones.  Now suppose you say, "Copy all mail correspondence into email unless it has Game of Thrones spoilers in it."  I can detect spoilers in English correspondence because I actually understand what's being said, but not Italian, where I can only copy the letters from one page to the other.

If I were a proxy, I'm a layer 7 English proxy but I only support Italian in layer 4 mode.

In Aspen Mesh and Istio, the protocol for a stream is configured in the Kubernetes service.  These are the options:

  • Specify a layer 7 protocol: Start the name of the service port with a layer 7 protocol that Istio and Envoy understand, for example "http-foo" or "grpc-bar".
  • Specify a layer 4 protocol: Start the name of the service port with a layer 4 protocol, for example "tcp-foo".  (You also use this if you know the layer 7 protocol but it's not one that Istio and Envoy support; for example, you might name a port "tcp-ssh")
  • Don't specify protocol at all: Name it without a protocol prefix, e.g. "clients".

If you don't specify a protocol at all, then Istio has to make a choice.  Before protocol sniffing was a feature, Istio chose to treat this with layer 4 mode.  Protocol sniffing is a new behavior that says, "try reading some of it - if it looks like a protocol you know, treat it like that protocol".
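The decision described above can be sketched as a small function. This is a simplified model, not Istio's code: the prefix sets are an illustrative subset of what Istio recognizes, and `declared_protocol` and `handling_mode` are hypothetical names.

```python
from typing import Optional

# Assumption: an illustrative subset of prefixes; the real list is longer.
LAYER7_PREFIXES = {"http", "http2", "grpc"}
LAYER4_PREFIXES = {"tcp"}


def declared_protocol(port_name: str) -> Optional[str]:
    """Return the protocol prefix of a port name, or None if there is none."""
    prefix = port_name.split("-", 1)[0]
    if prefix in LAYER7_PREFIXES or prefix in LAYER4_PREFIXES:
        return prefix
    return None


def handling_mode(port_name: str, sniffing_enabled: bool) -> str:
    """Decide how the proxy treats streams on a port with this name."""
    protocol = declared_protocol(port_name)
    if protocol in LAYER7_PREFIXES:
        return "layer7:" + protocol
    if protocol in LAYER4_PREFIXES:
        return "layer4"
    # No declared protocol: sniff if enabled, otherwise fall back to Layer 4.
    return "sniff" if sniffing_enabled else "layer4"


for name in ("http-foo", "grpc-bar", "tcp-ssh", "clients"):
    print(name, handling_mode(name, sniffing_enabled=False), handling_mode(name, sniffing_enabled=True))
```

Only the undeclared port ("clients") changes behavior when sniffing is switched on. Every port with a declared protocol is handled the same way either way.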

An important note here is that this sniffing applies for both passive monitoring and active management.  Istio both collects metrics and applies routing and policy. This is important because if a passive system has a sniff failure, it results only in a degradation of monitoring - details for a request may be unavailable.  But if an active system has a sniff failure, it may misapply routing or policy; it could send a request to the wrong service.

Benefits of Protocol Sniffing

The biggest benefit of protocol sniffing is that you don't have to specify the protocols.  Any communication stream can be sniffed without human intervention. If it happens to be HTTP, you can get detailed metrics on it.

That removes a significant amount of configuration burden and reduces time-to-value for your first service mesh install.  Drop it in and instantly get HTTP metrics.

Protocol Sniffing Failure Modes

However, as with most things, there is a tradeoff.  In some cases, protocol sniffing can produce results that might surprise you.  This happens when the sniffer classifies a stream differently than you or some other system would.

False Positive Match

This occurs when a protocol happens to look like HTTP, but the administrator doesn't want it to be treated by the proxy as HTTP.

One way this can happen is if the apps are speaking some custom protocol where the beginning of the communication stream looks like HTTP, but it later diverges.  Once it diverges and is no longer conforming to HTTP, the proxy has already begun treating it as HTTP and now must terminate the connection. This is one of the differences between passive sniffers and active sniffers - a passive sniffer could simply "cancel" sniffing.

Behavior Change:

  • Without sniffing: Stream is considered Layer 4 and completes fine.
  • With sniffing: Stream is considered Layer 7, and then when it later diverges, the proxy closes the stream.
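A toy model of this behavior change, under the assumption that a stream can be treated as a list of messages and that "looks like HTTP" is a simple request-line check. The `relay` function and the custom protocol are invented for illustration and do not reflect how Envoy processes bytes internally.

```python
import re

REQUEST_LINE = re.compile(rb"^[A-Z]+ \S+ HTTP/\d\.\d$")


def looks_like_http(message: bytes) -> bool:
    """Check whether a message starts with an HTTP-style request line."""
    return bool(REQUEST_LINE.match(message.split(b"\r\n", 1)[0]))


def relay(messages, sniffing_enabled: bool):
    """Return (forwarded_messages, outcome) for a stream on an undeclared port."""
    # With sniffing, the first message decides whether the stream is Layer 7.
    layer7 = sniffing_enabled and looks_like_http(messages[0])
    forwarded = []
    for message in messages:
        if layer7 and not looks_like_http(message):
            return forwarded, "closed: stream stopped conforming to HTTP"
        forwarded.append(message)
    return forwarded, "completed"


# Hypothetical custom protocol whose handshake happens to resemble HTTP.
custom_stream = [
    b"HELLO /session HTTP/1.1\r\n\r\n",
    b"\x00\x01\x02binary-frame",
]

print(relay(custom_stream, sniffing_enabled=False)[1])  # completed
print(relay(custom_stream, sniffing_enabled=True)[1])   # closed: stream stopped conforming to HTTP
```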

False Negative Match

This occurs when the client and server think they are speaking HTTP, but the sniffer decides it isn't HTTP.  In our case, that means the sniffer downgrades to Layer 4 mode. The proxy no longer applies Layer 7 policy (like Istio's HTTP Authorization) or collects Layer 7 telemetry (like request success/failure counts).

One case where this occurs is when the client and server are both technically violating a specification but in a way that they both understand.  A classic example in the HTTP space is line termination - technically, lines in HTTP must be terminated with a CRLF; two characters 0x0d 0x0a.  But most proxies and web servers will also accept HTTP where lines are only terminated with LF (just the 0x0a character), because some ancient clients and hacked-together UNIX tools just sent LFs.
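The line-termination case is easy to reproduce with two small parsers, one strict and one lenient. Both functions are hypothetical simplifications that only show how two implementations can disagree about the same bytes.

```python
def split_lines_strict(raw: bytes) -> list:
    """Accept only CRLF line endings; a bare CR or LF is an error."""
    lines = raw.split(b"\r\n")
    if any(b"\n" in line or b"\r" in line for line in lines):
        raise ValueError("bare CR or LF found")
    return lines


def split_lines_lenient(raw: bytes) -> list:
    """Accept CRLF or bare LF line endings."""
    return [line.rstrip(b"\r") for line in raw.split(b"\n")]


crlf_request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
lf_request = b"GET / HTTP/1.1\nHost: example.com\n\n"

# The lenient parser sees the same lines in both requests.
print(split_lines_lenient(crlf_request) == split_lines_lenient(lf_request))  # True

# The strict parser rejects the LF-only request.
try:
    split_lines_strict(lf_request)
except ValueError as err:
    print("strict parser rejected it:", err)
```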

That example is usually harmless but a riskier one is if a client can speak something that looks like HTTP, that the server will treat as HTTP, but the sniffer will downgrade.  This allows the client to bypass any Layer 7 policies the proxy would enforce. Istio currently applies sniffing to outbound traffic where the outbound target is unknown (often occurs for Egress traffic) or the outbound target is a service port without a protocol annotation.

Here's an example: I know of two non-standard behaviors that node.js' HTTP server framework allows.  The first is allowing extra spaces between the Request-URI and the HTTP-Version in the Request-Line. The second is allowing spaces in a Header field-name.  Here's an example with the weird parts highlighted:
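The original listing is not reproduced here, so the following is a hypothetical request of the kind described, with both oddities marked in comments. The path, the Host header and the header value are made up; only the header name "x-foo   bar" comes from the text. The regular expressions are simplified stand-ins for a strict and a lenient parser, not the actual rules used by node.js or Envoy.

```python
import re

# Strict: exactly one space between the three request-line parts.
STRICT_REQUEST_LINE = re.compile(rb"^[A-Z]+ \S+ HTTP/\d\.\d$")
# Lenient: one or more spaces between the parts.
LENIENT_REQUEST_LINE = re.compile(rb"^[A-Z]+ +\S+ +HTTP/\d\.\d$")
# Strict header names: token characters only, so no spaces allowed.
STRICT_HEADER_NAME = re.compile(rb"^[A-Za-z0-9!#$%&'*+.^_`|~-]+$")

odd_request = (
    b"GET /orders   HTTP/1.1\r\n"    # extra spaces before the HTTP-Version
    b"Host: example.com\r\n"
    b"x-foo   bar: baz\r\n"          # spaces inside the header field-name
    b"\r\n"
)

lines = odd_request.split(b"\r\n")
request_line = lines[0]
header_names = [line.split(b":", 1)[0] for line in lines[1:] if line]

strict_ok = bool(STRICT_REQUEST_LINE.match(request_line)) and all(
    STRICT_HEADER_NAME.match(name) for name in header_names
)
lenient_ok = bool(LENIENT_REQUEST_LINE.match(request_line))

print(strict_ok, lenient_ok)  # False True
print(header_names)           # [b'Host', b'x-foo   bar']
```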

If I send this to a node.js server, it accepts it as a valid HTTP request (for the curious, the extra whitespace in the request line is dropped, and the whitespace in the Header field-name is included so the header is named "x-foo   bar"). Node.js' HTTP parser is taken from nginx which also accepts the extra spaces. Nginx is pretty darn popular so other web frameworks and a lot of servers accept this. Interestingly, so does the HTTP parser in Envoy (but not the HTTP inspector).

Suppose I have a situation like this:  We just added a new capability to delete in-progress orders to the beta version of our service, so we want all DELETE requests to be routed to "foo-beta" and all other normal requests routed to "foo".  We might write an Istio VirtualService to route DELETE requests like this:
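The original listing is not available here, so this is a sketch of such a rule, written as a Python dict that mirrors the VirtualService structure (it could be serialized to JSON and applied). The `match`, `method.exact` and `route.destination.host` fields follow Istio's networking API. The resource name, the `hosts` entry and the use of foo-2 for the beta version and foo-1 for the normal version are assumptions chosen to line up with the names used in the text that follows. `pick_destination` is a hypothetical helper that only models first-match evaluation.

```python
import json

virtual_service = {
    "apiVersion": "networking.istio.io/v1alpha3",
    "kind": "VirtualService",
    "metadata": {"name": "foo"},
    "spec": {
        "hosts": ["foo"],
        "http": [
            {   # First rule: DELETE requests go to the beta version.
                "match": [{"method": {"exact": "DELETE"}}],
                "route": [{"destination": {"host": "foo-2"}}],
            },
            {   # Fallback rule: everything else goes to the normal version.
                "route": [{"destination": {"host": "foo-1"}}],
            },
        ],
    },
}


def pick_destination(vs: dict, method: str) -> str:
    """Evaluate the HTTP rules in order and return the first matching host."""
    for rule in vs["spec"]["http"]:
        matches = rule.get("match")
        if matches is None or any(m["method"]["exact"] == method for m in matches):
            return rule["route"][0]["destination"]["host"]
    raise LookupError("no route matched")


print(json.dumps(virtual_service, indent=2))
print(pick_destination(virtual_service, "DELETE"))  # foo-2
print(pick_destination(virtual_service, "GET"))     # foo-1
```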

If I send a request like this, it is properly routed to foo-2.

But if I send one like this, I bypass the route and go to foo-1.  Oops!
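The two requests are not shown here, so the toy model below uses hypothetical ones to show the mechanism: a strict sniffer, a Layer 4 fallback, and a DELETE route that is only evaluated for streams classified as HTTP. The function names, request paths and the strictness rule are assumptions. The model is meant to show the shape of the bypass, not Envoy's actual inspector logic.

```python
import re

# Assumption: the sniffer insists on single spaces in the request line.
STRICT_REQUEST_LINE = re.compile(rb"^([A-Z]+) (\S+) HTTP/\d\.\d$")


def sniff(first_line: bytes):
    """Toy strict sniffer: return the HTTP method, or None to fall back to Layer 4."""
    match = STRICT_REQUEST_LINE.match(first_line)
    return match.group(1).decode() if match else None


def proxy_route(raw_request: bytes) -> str:
    """Route DELETEs to foo-2 and other HTTP to foo-1; Layer 4 streams skip HTTP rules."""
    method = sniff(raw_request.split(b"\r\n", 1)[0])
    if method is None:
        return "foo-1"   # opaque TCP: no Layer 7 match is evaluated
    return "foo-2" if method == "DELETE" else "foo-1"


normal = b"DELETE /orders/42 HTTP/1.1\r\nHost: foo\r\n\r\n"
sneaky = b"DELETE /orders/42   HTTP/1.1\r\nHost: foo\r\n\r\n"   # extra spaces

print(proxy_route(normal))  # foo-2
print(proxy_route(sneaky))  # foo-1
```

If the server behind foo-1 tolerates the extra spaces, it will happily process a DELETE that the routing rule was supposed to send elsewhere.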

This means that clients can choose to "step around" routing if they can find requests that trick the sniffer. If those requests aren't accepted by the server at the other end, it should be OK.  However, if they are accepted by the server, bad things can happen. Additionally, you won't be able to audit or detect this case because you won't have Jaeger traces or access logs from the proxy since it thought the request wasn't HTTP.

(We investigated this particular case and ran our results past the Envoy and Istio security vulnerability teams before publishing this blog. While it didn't rise to the level of a security issue, we want it to be obvious to our users what the tradeoffs are. While the benefits of protocol sniffing may be worthwhile in many cases, most users will want to avoid protocol sniffing in security-sensitive applications.)

Behavior Change:

  • Without sniffing: Stream is Layer 7 and invalid requests are consistently rejected.
  • With sniffing: Some streams may be classified as Layer 4 and bypass Layer 7 routing or policy.

Recommendation

Protocol sniffing lessens the configuration burden to get started with Istio, but creates uncertainty about behaviors.  Because this uncertainty can be controlled by the client, it can be surprising or potentially hazardous. In production, I'd prefer to tell the proxy everything I know and have the proxy reject everything that doesn't look as expected.  Personally, I like my test environments to look like my prod environments ("Test Like You Fly") so I'm going to also avoid sniffing in test.

I would use protocol sniffing when I first dropped a service mesh into an evaluation scenario, when I'm at the stage of, "Let's kick the tires and see what this thing can tell me about my environment."

For this reason, Aspen Mesh recommends users don't rely on protocol sniffing in production.  All service ports should be declared with a name that specifies the protocol (things like "http-app" or "tcp-custom").  Our users will continue to receive "vet" warnings for service ports that don't comply, so they can be confident that their clusters will behave predictably.
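A check along these lines can be sketched in a few lines of Python. The `KNOWN_PROTOCOLS` set below is an assumption chosen for illustration, so consult the Istio documentation for the protocols your version supports. This is not the implementation behind the "vet" warnings.

```python
# Assumed protocol prefixes, for illustration only.
KNOWN_PROTOCOLS = {"http", "http2", "https", "grpc", "tcp", "tls", "udp"}


def declares_protocol(port_name: str) -> bool:
    """True if the name is '<protocol>' or '<protocol>-<suffix>'."""
    prefix = port_name.split("-", 1)[0].lower()
    return prefix in KNOWN_PROTOCOLS


def undeclared_ports(port_names: list[str]) -> list[str]:
    """Return the port names that would leave the protocol to sniffing."""
    return [name for name in port_names if not declares_protocol(name)]
```

For example, `undeclared_ports(["http-app", "metrics", "tcp-custom"])` flags only "metrics", since the other two names start with a protocol.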


On Silly Animals and Gray Codes

I love Information Theory. This is a random rumination on surprise.  

Helm (v2) is a templating engine and release manager for Kubernetes.  Basically it lets you leverage the combined knowledge of experts on how you should configure container software, but still gives you nerd knobs you can tweak as needed. When Helm deploys software, it's called a release. You can name your releases, like ingress-controller-for-prod.  You'll use this name later: "Hey, Helm, how is ingress-controller-for-prod doing?" or "Hey, Helm, delete all the stuff you made for ingress-controller-for-prod."

If you don't name a release, Helm will make up a release name for you. It's a combination of an adjective and an animal:

"Monicker ships with a couple of word lists that were written and approved by a group of giggling school children (and their dad). We built a lighthearted list based on animals and descriptive words (mostly adjectives)."

So if you don't pick a name, Helm will pick one for you. You might get jaunty ferret or gauche octopus. Helm could have decided to pick unique identifiers, say UUIDs, so instead of jaunty ferret you get 9fa485b1-6e8b-47c4-baa1-3923394382a5 or e0c2def3-bc94-44ff-b702-985d4eb38ded. To Helm itself, the UUIDs would be fine. To the humans, though, I argue 9fa485b1-6e8b-47c4-baa1-3923394382a5 is a bad option because our brains aren't good handlers of long strings like 9fa485b1-6e8b-47c4-baa1-3932394382a5; it's hard to say 9fa485b1-6e8b-47e4-baa1-3923394382a5 and you're not even going to notice that I've actually subtly mixed up digits in 9fa485b1-6e8b-47c4-baa1-3923393482a5 through this entire paragraph.  But if I had mixed up jaunty ferret and jumpy ferret you at least stand a chance. This is true even though the bitwise difference between the inputs that generated jaunty ferret and jumpy ferret is actually smaller than my UUID tricks.

Humans are awful at handling arbitrarily long numbers. We can't fake them well. We get dazzled by them. We are miserable at comparing even short numbers; sometimes people die as a result.

So, if you're building identifiers into a system, you should consider if those are going to be seen by humans. And if so, I think you should make those identifiers suitable for humans: distinctive and pronounceable.
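As a small sketch of the idea, the snippet below maps an arbitrary identifier to an adjective-animal pair. The word lists are tiny placeholders invented for this example (they are not Monicker's lists), and deriving the name by hashing an existing ID is an assumption of this sketch, not a description of how Helm generates names.

```python
import hashlib

# Placeholder word lists; real lists would be much longer.
ADJECTIVES = ["jaunty", "gauche", "jolly", "jumpy", "boasting", "singing"]
ANIMALS = ["ferret", "octopus", "bat", "aardvark", "clam", "otter"]


def friendly_name(identifier: str) -> str:
    """Derive a stable, pronounceable name from any identifier string."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    return f"{adjective}-{animal}"
```

With lists this short, collisions are frequent. A real system would need longer lists and a uniqueness check before handing a name to a human.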

I've seen this used elsewhere; Docker does it for container names (but scientists and hackers instead of animals).  Netlify and GitHub will do it for project names.  LastPass has a "Pronounceable" option and pwgen walks a fine line; they explicitly trade a little entropy to avoid users "simply writ[ing] the password on a piece of paper taped to the monitor..." in the hell that is modern user/password management. I've also worked with a respected support organization that does this for customer issues (and all the humans seemed to be massively more effective IMing/emailing/Wiki-writing/Chatting in the hall about names instead of 10-digit numbers).

Aspen Mesh does this in a few places. The first benefit is some great GIFs. On our team, if Randy asks you to fix something in the "singing clams" object, he'll Slack you a GIF as well. The second benefit is distinctiveness - after you've seen a GIF of singing clams, the likelihood you accidentally delete the boasting aardvark object is basically nil. The likelihood that your dreams are haunted by singing clams is an entirely different concern.

So I argue that replacing numbers with pronounceable and memorable human-language identifiers is great when we need things to be distinguishable and possible to remember. Humans are too easily tricked by subtle changes in long numbers.

An added bonus that we enjoy is that we bring some of our most meaningful cluster names to life at Aspen Mesh. Our first development cluster, our first production cluster and our first customer cluster all have a special place in our hearts. Naturally, we took those cluster names and made them into Aspen Mesh mascots:

  • jaunty-ferret
  • gauche-octopus
  • jolly-bat

Our cluster names make it easier for us to get development work done, and come with the added bonus of making the office more fun. If you want a set of these awesome cluster animals, leave a comment or tweet us @AspenMesh and we’ll send you a sticker pack. 


Why You Want Idempotency Anyway

We've been talking about how you can use a service mesh to do progressive delivery lately.  Progressive delivery fundamentally is about decoupling software delivery from user activation of said software.  Once decoupled, the user activation portion is under business control. It's early days, but the promise here is that software engineering can build new stuff as fast as they can, and your customer success team (already keeping a finger on the pulse of the users and market) can independently choose when to introduce what new features (and associated risk) to whom.

We've demonstrated how you can use Flagger (a progressive delivery Kubernetes operator) to command a service mesh to do canary deploys.  These help you activate functionality for only a subset of traffic and then progressively increase that subset as long as the new functionality is healthy.  We're also working on an enhancement to Flagger to do traffic mirroring. I think this is really cool because it lets you try out a feature activation without actually exposing any users to the impact.  It's a "pre-stage" to a canary deployment: Send one copy to the original service, one copy to the canary, and check if the canary responds as well as the original.

There's a caveat we bring up when we talk about this, however: Idempotency. You can only do traffic mirroring if you're OK with duplicating requests, and sending one to the primary and one to the canary. If your app and infrastructure are OK with duplicating these requests, the requests are said to be idempotent.

Idempotency

Idempotency is the ability to apply an operation more than once and not change the result.  In math, we'd say that:

f(f(a)) = f(a)

An example of a mathematical function that's idempotent is ABS(), the absolute value.

ABS(-3.7) = 3.7
ABS(ABS(-3.7)) = ABS(3.7) = 3.7

We can repeat taking the absolute value of something as many times as we want; it won't change any more after the first time. Similar things in your math toolbox: CEIL(), FLOOR(), ROUND(). But SQRT() is not in general idempotent: SQRT(16) = 4, SQRT(SQRT(16)) = 2, and so on.
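Here is a minimal way to poke at this property in Python. The helper `is_idempotent_at` is a name made up for this sketch. Note that checking sample points can only disprove idempotency, never prove it for every input.

```python
import math


def is_idempotent_at(f, x) -> bool:
    """Check f(f(x)) == f(x) for one sample input."""
    once = f(x)
    return f(once) == once


samples = [-3.7, 0.0, 2.5, 16.0]
abs_ok = all(is_idempotent_at(abs, x) for x in samples)
floor_ok = all(is_idempotent_at(math.floor, x) for x in samples)
sqrt_ok = is_idempotent_at(math.sqrt, 16.0)  # sqrt(16) is 4, sqrt(4) is 2
```

Here `abs_ok` and `floor_ok` come out true and `sqrt_ok` comes out false. SQRT() does pass the check at a few special inputs such as 0 and 1, which is why it is only "not in general" idempotent.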

For web apps, idempotent requests are those that can be processed twice without causing something invalid to happen.  So read-only operations like HTTP GETs are always idempotent (as long as they're actually only reads). Some kinds of writes are also idempotent: suppose I have a request that says "Set andrews_timezone to MDT".  If that request gets processed twice, my timezone gets set to MDT twice. That's OK. The first one might have changed it from PDT to MDT, and the second one "changes" it from MDT to MDT, so no change. But in the end, my timezone is MDT and so I'm good.

An example of a not-idempotent request is one that says "Deduct $100 from andrews_account".  If we apply that request twice, then my account will actually have $200 deducted from it and I don't want that.  You remember those e-commerce order pages that say "Don't click reload or you may be billed twice"? They need some idempotency!
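The difference between the two kinds of write is easy to see in a small sketch. The function names and the starting balance of 500 are assumptions made for this example.

```python
def set_timezone(state: dict, tz: str) -> dict:
    """Idempotent write: the result does not depend on how often it runs."""
    return {**state, "timezone": tz}


def deduct(state: dict, amount: int) -> dict:
    """Not idempotent: every application changes the balance again."""
    return {**state, "balance": state["balance"] - amount}


start = {"timezone": "PDT", "balance": 500}

once = set_timezone(start, "MDT")
twice = set_timezone(once, "MDT")  # same state as `once`

paid_once = deduct(start, 100)
paid_twice = deduct(paid_once, 100)  # a duplicate costs another 100
```

Replaying `set_timezone` leaves the state unchanged after the first call, while replaying `deduct` keeps draining the account.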

This is important for traffic mirroring because we're going to duplicate the request and send one copy to the primary and one to the canary.  While idempotency is great for enabling this traffic mirroring case, I'm here to tell you why it's a great thing to have anyway, even if you're never going to do progressive delivery.

Exactly-Once Delivery and Invading Generals

There's a fundamental tension that emerges if you have distributed systems that communicate over an unreliable channel.  You can never be sure that the other side received your message exactly once.  I'll retell the parable of the Invading Generals as it was told to me the first time I was sure I had designed a system that solved this paradox.

There is a very large invading army that has camped in a valley for the night.  The defenders are split into two smaller armies that surround the invading army; one set of defenders on the eastern edge of the valley and one on the west.  If both defending armies attack simultaneously, they will defeat the invaders. However, if only one attacks, it will be too small; the invaders will defeat it, and then turn to the remaining defenders at their leisure and defeat them as well.

The general for the eastern defenders needs to coordinate with the general for the western defenders for simultaneous attack.  Both defenders can send spies with messages through the valley, as many as they want. Each spy has a 10% chance of being caught by the invaders and killed before his message is delivered, and a 90% chance of successfully delivering the message.  You are the general for the eastern defenders: what message will you send to the western defenders to guarantee you both attack simultaneously and defeat the invaders?

Turns out, there is no guaranteed safe approach.  Let's go through it. First, let's send this message:

"Western General, I will attack at dawn and you must do the same."

- Eastern General

There's a 90% chance that your spy will get through, but there's a 10% chance that he won't, in which case only you will attack and you will lose to the invaders. The problem statement said you can send as many spies as you want, so we must be able to do better!

OK, let's send lots of spies with the same message.  Then our probability of success is 1-0.1^n, where n is the number of spies.  So we can asymptotically approach 100% probability that the other side agrees, but we can never be sure.
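To put numbers on that formula, here is a quick sketch. The function name is made up for this example, and the 10% capture rate comes from the parable.

```python
def delivery_probability(spies: int, p_caught: float = 0.1) -> float:
    """Probability that at least one of `spies` messengers gets through."""
    return 1.0 - p_caught ** spies


for n in (1, 2, 5, 10):
    print(n, delivery_probability(n))
```

Each extra spy adds roughly one more "9" to the probability, but the true value never reaches 1. (For large enough n the floating-point result will round to exactly 1.0, which is a limitation of the arithmetic and not a solution to the generals' problem.)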

How about this message:

"Western General, I am prepared to attack at dawn.  Send me a spy confirming that you have received this message so I know you will also attack.  If I don't receive confirmation, I won't attack because my army will be defeated if I attack alone."

- Eastern General

Now, if you don't receive a spy back from the western general you'll send another, and another, until you get a response. But... put yourself in the shoes of the western general. How does the western general know that you'll receive the confirmation spy? Should the western army attack at dawn? What if the confirmation spy was caught and now only the western army attacks, ensuring defeat?

The western general could send lots of confirmation spies, so there is a high probability that at least one gets through.  But they can't guarantee with 100% probability that one gets through.

The western general could also send this response:

"Eastern General, we have received your spy.  We are also prepared to attack at dawn. We will be defeated if you do not also attack, and I know you won't attack if you don't know that we have received your message.  Please send back a spy confirming that you have received my confirmation or else we will not attack because we will be destroyed."

 

- Western General

A confirmation of a confirmation! (In networking ARQ terms, an ACK-of-an-ACK). Again, this can reduce the probability of failure but cannot provide guarantees: we can keep shifting uncertainty between the Eastern and Western generals but never eliminate it.

Engineering Approaches

Okay, we can't know for sure that our message is delivered exactly once (regardless of service mesh or progressive delivery or any of that), so what are we going to do?  There are a few approaches:

• Retry naturally-idempotent requests

• Uniqueify requests

• Conditional updates

• Others

Retry Naturally-Idempotent Requests

If you have a request that is naturally idempotent, like getting the temperature on a thermostat, the end user can just repeat it if they didn't get the response they want.

Uniqueify Requests

Another approach is to make requests unique at the client, and then have all the other services avoid processing the same unique request twice.  One way to do this is to invent a UUID at the client and then have servers remember all the UUIDs they've already seen. My deduction request would then look like:

This is unique request f41182d1-f4b2-49ec-83cc-f5a8a06882aa.
If you haven't seen this request before, deduct $100 from andrews_account.

Then you can submit this request as many times as you want to the processor, and the processor can check if it's handled "f41182d1-f4b2-49ec-83cc-f5a8a06882aa" before.  There are a few caveats here.

First, you have to have a way to generate unique identifiers. UUIDs are pretty good but theoretically there's an extremely small possibility of UUID collision; practically there are a couple of minor foot-guns to watch out for, like generating UUIDs on two VMs or containers that both have fake virtual MAC addresses that match. You can also have the server make the unique identifier for you (it could be an auto-generated primary key in a database that is guaranteed to be unique).

Second, your server has to remember all the UUIDs that you have processed. Typically you put these in a database (maybe using UUID as a primary key anyway). If recording a processed UUID is a separate step from the action you take when processing it, there's still a "risk window": you might commit a UUID and then fail to process it, or you might process it and fail to commit the UUID. Algorithms like two-phase commit and Paxos can help close the risk window.
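A minimal sketch of the server side might look like the following. The class `DeductionProcessor`, the in-memory set and the starting balance are all assumptions for illustration. A real service would keep the seen IDs in durable storage. The client uses `uuid.uuid4()`, which is random-based, so the MAC address foot-gun mentioned above does not apply to it.

```python
import uuid


class DeductionProcessor:
    """Processes each uniquely identified request at most once."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._seen: set[str] = set()  # in-memory stand-in for a database table

    def deduct(self, request_id: str, amount: int) -> bool:
        """Apply the deduction unless this request_id was already handled."""
        if request_id in self._seen:
            return False  # duplicate: ignore it
        # In a real system, recording the ID and changing the balance
        # should happen in one transaction to close the "risk window".
        self._seen.add(request_id)
        self.balance -= amount
        return True


processor = DeductionProcessor(balance=500)
request_id = str(uuid.uuid4())  # generated once, by the client

first = processor.deduct(request_id, 100)
retry = processor.deduct(request_id, 100)  # same ID, so nothing happens
```

The client can now retry as often as it likes: only the first copy changes the balance.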

Conditional Updates

Another approach is to include information in the request about what things looked like when the client sent the request, so that the server can abort the request if something has changed.  This includes the case that the "change" is a duplicate request and we've already processed it.

For instance, maybe my bank ledger looks like this:

Then I would make my request look like:

As long as the last transaction in andrews_account is number 563,
Create entry 564: Deduct $100 from andrews_account

If this request gets duplicated, the first will succeed and the second will fail.  After the first:

The duplicated request will fail:

As long as the last transaction in andrews_account is number 563,
Create entry 564: Deduct $100 from andrews_account

In this case the server could respond to the first copy with "Success" and the second copy with a soft failure like "Already committed" or just tell the client to read and notice that its update already happened. MongoDB, AWS DynamoDB and others support these kinds of conditional updates.
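The conditional update described above can be sketched as a compare-then-append operation. The `Ledger` class and the starting balance are assumptions for this example, and a real database would perform the check and the write atomically.

```python
class Ledger:
    """Append-only ledger that supports conditional appends."""

    def __init__(self, last_entry: int, balance: int) -> None:
        self.last_entry = last_entry
        self.balance = balance

    def deduct_if_last_is(self, expected_last: int, amount: int) -> str:
        """Append a deduction only if the ledger has not moved on."""
        if self.last_entry != expected_last:
            # Soft failure: nothing changes. The client can re-read the
            # ledger to see whether its own update already happened.
            return "Precondition failed"
        self.last_entry += 1
        self.balance -= amount
        return "Success"


ledger = Ledger(last_entry=563, balance=500)
first = ledger.deduct_if_last_is(563, 100)      # creates entry 564
duplicate = ledger.deduct_if_last_is(563, 100)  # last entry is now 564
```

The duplicate fails its precondition because the first copy already advanced the ledger to entry 564, so the account is only charged once.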

Others

There are many practical approaches to this problem. I recommend doing some initial reasoning about idempotency, and then trying to shift as much functionality as you can to the database or persistent state layer you're using. While I gave a quick tour of some of the things involved in idempotency, there are a lot of other tricks like write-ahead journalling, conflict-free replicated data types and others that can enhance reliability.

Conclusion

Traffic mirroring is a great way to exercise canaries in a production environment before exposing them to your users.  Mirroring makes a duplicate of each request and sends one copy to the primary, one copy to the new canary version of your microservice.  This means that you must use mirroring only for idempotent requests: requests that can be applied twice without causing something erroneous to happen.

This caveat probably exists even if you aren't doing traffic mirroring, because networks fail. The Eastern General and Western General can never really be sure their messages are delivered exactly once; there will always be a case where they may have to retry. I think you want to build idempotency wherever possible, and then you should use traffic mirroring to test your canary deployments.


Important Security Updates in Aspen Mesh 1.1.13

Aspen Mesh is announcing the release of 1.1.13, which addresses important Istio security updates. Below are the details of the security fixes, taken from the Istio 1.1.13 security update.

ISTIO-SECURITY-2019-003: An Envoy user reported publicly an issue (c.f. Envoy Issue 7728) about regular expressions matching that crashes Envoy with very large URIs.

  • CVE-2019-14993: After investigation, the Istio team has found that this issue could be leveraged for a DoS attack in Istio, if users are employing regular expressions in some of the Istio APIs: JWT, VirtualService, HTTPAPISpecBinding, QuotaSpecBinding.

ISTIO-SECURITY-2019-004: Envoy, and subsequently Istio, are vulnerable to a series of trivial HTTP/2-based DoS attacks:

  • CVE-2019-9512: HTTP/2 flood using PING frames and queuing of response PING ACK frames that results in unbounded memory growth (which can lead to out of memory conditions).
  • CVE-2019-9513: HTTP/2 flood using PRIORITY frames that results in excessive CPU usage and starvation of other clients.
  • CVE-2019-9514: HTTP/2 flood using HEADERS frames with invalid HTTP headers and queuing of response RST_STREAM frames that results in unbounded memory growth (which can lead to out of memory conditions).
  • CVE-2019-9515: HTTP/2 flood using SETTINGS frames and queuing of SETTINGS ACK frames that results in unbounded memory growth (which can lead to out of memory conditions).
  • CVE-2019-9518: HTTP/2 flood using frames with an empty payload that results in excessive CPU usage and starvation of other clients.
  • See this security bulletin for more information

The 1.1.13 binaries are available for download here 

For upgrade procedures for Aspen Mesh deployments installed via Helm (helm install), please visit our Getting Started page.

 


Why Is Policy Hard?

Aspen Mesh spends a lot of time talking to users about policy, even if we don’t always start out calling it that. A common pattern we see with clients is:

  1. Concept: "Maybe I want this service mesh thing"
  2. Install: "Ok, I've got Aspen Mesh installed, now what?"
  3. Observe: "Ahhh! Now I see how my microservices are communicating.  Hmmm, what's that? That pod shouldn't be talking to that database!"
  4. Act: "Hey mesh, make sure that pod never talks to that database"

The Act phase is interesting, and there’s more to it than might be obvious at first glance. I'll propose that in this blog, we work through some thought experiments to delve into how service mesh can help you act on insights from the mesh.

First, put yourself in the shoes of the developer that just found out their test pod is accidentally talking to the staging database. (Ok, you're working from home today so you don't have to put on shoes; the cat likes sleeping on your shoeless feet better anyways.) You want to control the behavior of a narrow set of software for which you're the expert; you have local scope and focus.

Next, put on the shoes of a person responsible for operating many applications; people we talk to often have titles that include Platform, SRE, Ops, Infra. Each day they’re diving into different applications so being able to rapidly understand applications is key. A consistent way of mapping across applications, datacenters, clouds, etc. is critical. Your goal is to reduce "snowflake architecture" in favor of familiarity to make it easier when you do have to context switch.

Now let's change into the shoes of your org's Compliance Officer. You’re on the line for documenting and proving that your platform is continually meeting compliance standards. You don't want to be the head of the “Department of No”, but what’s most important to you is staying out of the headlines. A great day at work for you is when you've got clarity on what's going on across lots of apps, databases, external partners, every source of data your org touches AND you can make educated tradeoffs to help the business move fast with the right risk profile. You know it’s ridiculous to be involved in every app change, so you need separation-of-concerns.

I'd argue that all of these people have policy concerns. They want to be able to specify their goals at a suitably high level and leave the rote and repetitive portions to an automated system. The challenging part is there's only one underlying system ("the Kubernetes cluster") that has to respond to each of these disparate personas.

So, to me policy is about transforming a bunch of high-level behavioral prescriptions into much lower-level versions through progressive stages. Useful real-world policy systems do this in a way that is transparent and understandable to all users, and minimizes the time humans spend coordinating. Here's an example "day-in-the-life" of a policy:

At the top is the highest level goal: "Devs should test new code without fear". Computers are hopeless to implement this. At the bottom is a rule suitable for a computer like a firewall to implement.

The layers in the middle are where a bad policy framework can really hurt. Some personas (the hypothetical Devs) want to instantly jump to the bottom - they're the "4.3.2.1" in the above example. Other personas (the hypothetical Compliance Officer) are way up top, going down a few layers but not getting to the bottom on a day-to-day basis.

I think the best policy frameworks help each persona:

  • Quickly find the details for the layer they care about right now.
  • Understand "Where did this come from?" (connect to higher layers)
  • Understand "Is this doing what I want?" (trace to lower layers)
  • Know "Where do I go to change this?" (edit/create policy)

As an example, let's look at iptables, one of the firewalling/packet mangling frameworks for Linux.  This is at that bottom layer in my example stack - very low-level packet processing that I might look at if I'm an app developer and my app's traffic isn't doing what I'd expect.  Here's an example dump:


root@kafka-0:/# iptables -n -L -v --line-numbers -t nat
Chain PREROUTING (policy ACCEPT 594K packets, 36M bytes)
num   pkts bytes target             prot opt in  out source      destination
1     594K   36M ISTIO_INBOUND      tcp  --  *   *   0.0.0.0/0   0.0.0.0/0

Chain INPUT (policy ACCEPT 594K packets, 36M bytes)
num   pkts bytes target             prot opt in  out source      destination

Chain OUTPUT (policy ACCEPT 125K packets, 7724K bytes)
num   pkts bytes target             prot opt in  out source      destination
1      12M  715M ISTIO_OUTPUT       tcp  --  *   *   0.0.0.0/0   0.0.0.0/0

Chain POSTROUTING (policy ACCEPT 12M packets, 715M bytes)
num   pkts bytes target             prot opt in  out source      destination

Chain ISTIO_INBOUND (1 references)
num   pkts bytes target             prot opt in  out source      destination
1        0     0 RETURN             tcp  --  *   *   0.0.0.0/0   0.0.0.0/0   tcp dpt:22
2     594K   36M RETURN             tcp  --  *   *   0.0.0.0/0   0.0.0.0/0   tcp dpt:15020
3        2   120 ISTIO_IN_REDIRECT  tcp  --  *   *   0.0.0.0/0   0.0.0.0/0

Chain ISTIO_IN_REDIRECT (1 references)
num   pkts bytes target             prot opt in  out source      destination
1        2   120 REDIRECT           tcp  --  *   *   0.0.0.0/0   0.0.0.0/0   redir ports 15006

Chain ISTIO_OUTPUT (1 references)
num   pkts bytes target             prot opt in  out source      destination
1      12M  708M ISTIO_REDIRECT     all  --  *   lo  0.0.0.0/0   !127.0.0.1
2        7   420 RETURN             all  --  *   *   0.0.0.0/0   0.0.0.0/0   owner UID match 1337
3        0     0 RETURN             all  --  *   *   0.0.0.0/0   0.0.0.0/0   owner GID match 1337
4     119K 7122K RETURN             all  --  *   *   0.0.0.0/0   127.0.0.1
5        4   240 ISTIO_REDIRECT     all  --  *   *   0.0.0.0/0   0.0.0.0/0

Chain ISTIO_REDIRECT (2 references)
num   pkts bytes target             prot opt in  out source      destination
1      12M  708M REDIRECT           tcp  --  *   *   0.0.0.0/0   0.0.0.0/0   redir ports 15001


This allows me to quickly understand a lot of details about what is happening at this layer. Each rule specification is on the right-hand side and is relatively intelligible to the personas that operate at this layer. On the left, I get "pkts" and "bytes" - this is a count of how many packets have triggered each rule, helping me answer "Is this doing what I want it to?". There's even more information here if I'm really struggling: I can log the individual packets that are triggering a rule, or mark them in a way that I can capture them with tcpdump.  

Finally, furthest on the left in the "num" column is a line number, which is necessary if I want to modify or delete rules or add new ones before/after; this is a little bit of help for "Where do I go to change this?". I say a little bit because in most systems that I'm familiar with, including the one I grabbed that dump from, iptables rules are produced by some program or higher-level system; they're not written by a human. So if I just added a rule, it would only apply until that higher-level system intervened and changed the rules (in my case, until a new Pod was created, which can happen at any time). I need help navigating up a few layers to find the right place to effect the change.

iptables lets you organize groups of rules into your own chains. In this case, the name of the chain (ISTIO_***) is a hint that Istio produced this, so I've got a hint on what higher layer to examine.
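That naming hint is easy to use programmatically. The sketch below pulls chain names out of an `iptables -L` style dump and filters them by prefix. The helper name `chains_by_prefix` is made up for this example, and the sample text is abbreviated from the dump above.

```python
import re

# Matches header lines such as "Chain ISTIO_INBOUND (1 references)".
CHAIN_HEADER = re.compile(r"^Chain (\S+) \(")


def chains_by_prefix(dump: str, prefix: str) -> list[str]:
    """Return chain names from `iptables -L` output that start with prefix."""
    names = []
    for line in dump.splitlines():
        match = CHAIN_HEADER.match(line.strip())
        if match and match.group(1).startswith(prefix):
            names.append(match.group(1))
    return names


sample = """\
Chain PREROUTING (policy ACCEPT 594K packets, 36M bytes)
Chain ISTIO_INBOUND (1 references)
Chain ISTIO_REDIRECT (2 references)
"""
istio_chains = chains_by_prefix(sample, "ISTIO_")
```

A list of chains grouped by prefix gives a first answer to "Where did this come from?", though only because the higher-level system chose a recognizable naming convention.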

For a much different example, how about the Kubernetes CI Robot (from Prow)? If you've ever made a PR to Kubernetes or many other CNCF projects, you likely interacted with this robot. It's an implementer of policy; in this case the policies around changing source code for Kubernetes. One of the policies it manages is compliance with the Contributor License Agreement; contributors agree to grant some Intellectual Property rights surrounding their contributions. If k8s-ci-robot can't confirm that everything is alright, it will add a comment to your PR:

This is much different than firewall policy, but I say it's still policy and I think the same principles apply. Let's explore. If you had to diagram the policy around this, it would start at the top with the legal principle that Kubernetes wants to make sure all the software under its umbrella has free and clear IP terms. Stepping down a layer, the Kubernetes project decided to satisfy that requirement by requiring a CLA for any contributions. And so on, until we get to the bottom layer: the code that implements the CLA check.

As an aside, the code that implements the CLA check is actually split into two halves: first there's a CI job that actually checks the commits in the PR against a database of signed CLAs, and then there's code that takes the result of that job and posts helpful information for users to resolve any issues. That's not visible or important at that top layer of abstraction (the CNCF lawyers shouldn't care).

This policy structure is easy to navigate. If your CLA check fails, the comment from the robot has great links. If you're an individual contributor you can likely skip up a layer, sign the CLA and move on. If you're contributing on behalf of a company, the links will take you to the document you need to send to your company's lawyers so they can sign on behalf of the company.

So those are two examples of policy. You probably encounter many other ones every day from corporate travel policy to policies (written, unwritten or communicated via email missives) around dirty dishes.

It's easy to focus on the technical capabilities of the lowest levels of your system. But I'd recommend that you don't lose focus on the operability of your system. It’s important that it be transparent and easy to understand. Both iptables and k8s-ci-robot are transparent. The k8s-ci-robot has an additional feature: it knows you're probably wondering "Where did this come from?" and it answers that question for you. This helps you and your organization navigate the layers of policy.

When implementing service mesh to add observability, resilience and security to your Kubernetes clusters, it’s important to consider how to set up policy in a way that can be navigated by your entire team. With that end in mind, Aspen Mesh is building a policy framework for Istio that makes it easy to implement policy and understand how it will affect application behavior.


Securing Containerized Applications With Service Mesh

The self-contained, ephemeral nature of microservices comes with some serious upside, but keeping track of every single one is a challenge, especially when trying to figure out how the rest are affected when a single microservice goes down. The end result is that if you’re operating or developing a microservices architecture, there’s a good chance part of your days are spent wondering what your services are up to.

With the adoption of microservices, problems also emerge due to the sheer number of services that exist in large systems. Problems like security, load balancing, monitoring and rate limiting that had to be solved once for a monolith, now have to be handled separately for each service.

The technology aimed at addressing these microservice challenges has been rapidly evolving:

  1. Containers facilitate the shift from monolith to microservices by enabling independence between applications and infrastructure.
  2. Container orchestration tools solve microservices build and deploy issues, but leave many unsolved runtime challenges.
  3. Service mesh addresses runtime issues including service discovery, load balancing, routing and observability.

Securing Services with a Service Mesh

A service mesh provides an advanced toolbox that lets users add security, stability and resiliency to containerized applications. One of the more common applications of a service mesh is bolstering cluster security. There are 3 distinct capabilities provided by the mesh that enable platform owners to create a more secure architecture.

Traffic Encryption  

As a platform operator, I need to provide encryption between services in the mesh. I want to leverage mTLS to encrypt traffic between services. I want the mesh to automatically encrypt and decrypt requests and responses, so I can remove that burden from my application developers. I also want it to improve performance by prioritizing the reuse of existing connections, reducing the need for the computationally expensive creation of new ones. I also want to be able to understand and enforce how services are communicating and prove it cryptographically.

Security at the Edge

As a platform operator, I want Aspen Mesh to add a layer of security at the perimeter of my clusters so I can monitor and address compromising traffic as it enters the mesh. I can use the built-in power of Kubernetes as an ingress controller to add security with ingress rules such as allowlisting and denylisting. I can also apply service mesh route rules to manage compromising traffic at the edge. I also want control over egress so I can ensure that our network traffic does not go places it shouldn't (deny by default and only talk to what you allowlist).
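As a rough illustration of the "deny by default, allowlist explicitly" egress stance described above, here is a minimal Python sketch. The `EgressPolicy` class and the hostnames are hypothetical and are not part of Aspen Mesh, Istio, or Kubernetes; a real mesh would express this as egress configuration rather than application code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EgressPolicy:
    """Hypothetical egress policy: every host is denied unless allowlisted."""
    allowed_hosts: frozenset = field(default_factory=frozenset)

    def permits(self, host: str) -> bool:
        # Normalize case and a trailing dot so "Payments.Example.com." still matches.
        normalized = host.lower().rstrip(".")
        return normalized in self.allowed_hosts


# Illustrative allowlist with a single external dependency.
policy = EgressPolicy(allowed_hosts=frozenset({"payments.example.com"}))
```

An empty policy permits nothing, which is the point: reaching a new destination requires someone to add it on purpose.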

Role Based Access Control (RBAC)

As the platform operator, it’s important that I am able to enforce least privilege so the developers on my platform only have access to what they need, and nothing more. I want to enable controls so app developers can write policy for their apps and only their apps so that they can move quickly without impacting other teams. I want to use the same RBAC framework that I am familiar with to provide fine-grained RBAC within my service mesh.
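The "their apps and only their apps" requirement boils down to a default-deny lookup. The sketch below is a simplified model, not the Kubernetes RBAC API: the `ROLE_BINDINGS` table, team names, and action names are all made up for illustration.

```python
# Hypothetical bindings: each team may manage policy only in its own namespace.
ROLE_BINDINGS = {
    "team-cart": {("cart", "write-policy"), ("cart", "read-policy")},
    "team-search": {("search", "write-policy"), ("search", "read-policy")},
}


def is_allowed(subject: str, namespace: str, action: str) -> bool:
    """Return True only if an explicit binding grants the action (default deny)."""
    return (namespace, action) in ROLE_BINDINGS.get(subject, set())
```

Unknown subjects and unlisted namespace/action pairs fall through to "no", so a team gains access only through an explicit grant.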

How a Service Mesh Adds Security

You’re probably thinking to yourself, traffic encryption and fine-grained RBAC sound great, but how does a service mesh actually get me to them? Service meshes that leverage a sidecar approach are uniquely positioned to intercept and encrypt data. A sidecar proxy is a prime insertion point to ensure that every service in a cluster is secured and monitored in real time. Let’s explore some details around why sidecars are a great place for security.

Sidecar Is a Great Place for Security

Securing applications and infrastructure has always been daunting, in part because the adage really is true: you are only as secure as your weakest link.  Microservices are an opportunity to improve your security posture but can also cut the other way, presenting challenges around consistency.  For example, the best organizations use the principle of least privilege: an app should only have the minimum amount of permissions and privilege it needs to get its job done.  That's easier to apply where a small, single-purpose microservice has clear and narrowly-scoped API contracts.  But there's a risk that as application count increases (lots of smaller apps), this principle can be unevenly applied. Microservices, when managed properly, increase feature velocity and enable security teams to fulfill their charter without becoming the Department of No.

There's tension: Move fast, but don't let security coverage slip through the cracks.  Prefer many smaller things to one big monolith, but secure each and every one.  Let each team pick the language of their choice, but protect them with a consistent security policy.  Encourage app teams to debug, observe and maintain their own apps but encrypt all service-to-service communication.

A sidecar is a great way to balance these tensions with an architecturally sound security posture.  Sidecar-based service meshes like Istio and Linkerd 2.0 put their datapath functionality into a separate container and then situate that container as close to the application they are protecting as possible.  In Kubernetes, the sidecar container and the application container live in the same Kubernetes Pod, so the communication path between sidecar and app is protected inside the pod's network namespace; by default it isn't visible to the host or other network namespaces on the system.  The app, the sidecar and the operating system kernel are involved in communication over this path.  Compared to putting the security functionality in a library, using a sidecar adds the surface area of kernel loopback networking inside of a namespace, instead of just kernel memory management.  This is additional surface area, but not much.

The major drawbacks of library approaches are consistency and sprawl in polyglot environments.  If you have a few different languages or application frameworks and take the library approach, you have to secure each one.  This is not impossible, but it's a lot of work.  For each different language or framework, you get or choose a TLS implementation (perhaps choosing between OpenSSL and BoringSSL).  You need a configuration layer to load certificates and keys from somewhere and safely pass them down to the TLS implementation.  You need to reload these certs and rotate them.  You need to evaluate "information leakage" paths: does your config parser log errors in plaintext (so it by default might print the TLS key to the logs)?  Is it OK for app core dumps to contain these keys?  How often does your organization require re-keying on a connection?  By bytes or time or both?  Minimum cipher strength?  When a CVE in OpenSSL comes out, which apps are using that version and need updating?  Who on each app team is responsible for updating OpenSSL, and how quickly can they do it?  How many apps have a certificate chain built into them for consuming public websites even if they are internal-only?  How many Dockerfiles will you need to update the next time a public signing authority has to revoke one?  How does each app hold up against slow-request attacks like Slowloris?

Your organization can do all this work.  In fact, parts probably already have - above is our list of painful app security experiences but you probably have your own additions.  It is a lot of cross-organizational effort and process to get it right.  And you have to get it right everywhere, or your weakest link will be exploited.  Now with microservices, you have even more places to get it right.  Instead, our advice is to focus on getting it right once in the sidecar, and then distributing the sidecar everywhere, and get back to adding business value instead of duplicating effort.

There are some interesting developments on the horizon like the use of kernel TLS to defer bulk and some asymmetric crypto operations to the kernel.  That's great:  Implementations should change and evolve.  The first step is providing a good abstraction so that apps can delegate to lower layers. Once that's solid, it's straightforward to move functionality from one layer to the next as needed by use case, because you don't perturb the app any more.  As precedent, consider TCP Segmentation Offload, which lets the network card manage splitting app data into the correct size for each individual packet.  This task isn't impossible for an app to do, but it turns out to be wasted effort.  Once TCP segmentation was deferred to the kernel, it left the realm of the app.  Then, kernels, network drivers, and network cards were free to focus on the interoperability and semantics required to perform TCP segmentation at the right place.  That's our position for this higher-level service-to-service communication security: move it outside of the app to the sidecar, and then let sidecars, platforms, kernels and networking hardware iterate.
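To make the segmentation precedent concrete, this is roughly the chore an app would carry if it had to split its own data into packet-sized pieces. The `segment` helper is purely illustrative (real segmentation also involves headers, sequence numbers and checksums), and the 1460-byte size in the usage note is just a common maximum segment size on Ethernet, assumed here for the example.

```python
def segment(payload: bytes, mss: int) -> list[bytes]:
    """Split payload into chunks of at most mss bytes, preserving order."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    # Step through the payload mss bytes at a time; the last chunk may be shorter.
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]
```

Calling `segment(data, 1460)` on 3000 bytes yields chunks of 1460, 1460 and 80 bytes. Nothing about this is hard, but no app team wants to own it, which is the argument for pushing it down a layer.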

Envoy Is a Great Sidecar

We use Envoy as our sidecar because it's lightweight, has some great features and good API-based configurability.  Here are some of our favorite parts about Envoy:

  • Configurable TLS Parameters: Envoy exposes all the TLS configuration points you'd expect (cipher strength, protocol versions, curves).  The advantage to using Envoy is that they're configured the same way for every app using the sidecar.
  • Mutual TLS: Typically TLS is used to authenticate the server to the client, and to encrypt communication.  What's missing is authenticating the client to the server - if you do this, then the server knows what is talking to it.  Envoy supports this bi-directional authentication out of the box, which can easily be incorporated into a SPIFFE system.  In today's complex cloud datacenter, you're better off if you trust things based on cryptographic proof of what they are, instead of network perimeter protection of where they called from.
  • BoringSSL: This fork of OpenSSL removed huge amounts of code like implementations of obsolete ciphers and cleaned up lots of vestigial implementation details that had repeatedly been the source of security vulnerabilities.  It's a good default choice if you don't need any OpenSSL-specific functionality because it's easier to get right.
  • Security Audit: A security audit can't prove the absence of vulnerabilities but it can catch mistakes that demonstrate either architectural weaknesses or implementation sloppiness.  Envoy's security audit did find issues but in our opinion indicated a high level of security health.
  • Fuzzed and Bountied: Envoy is continuously fuzzed (exposed to malformed input to see if it crashes) and covered by Google's Patch Reward security bug bounty program.
  • Good API Granularity: API-based configuration doesn't mean "just serialize/deserialize your internal state and go."  Careful APIs thoughtfully map to the "personas" of what's operating them (even if those personas are other programs).  Envoy's xDS APIs in our experience partition routing behavior from cluster membership from secrets.  This makes it easy to make well-partitioned controllers.  A knock-on benefit is that it is easy in our experience to debug and test Envoy because config constructs usually map pretty clearly to code constructs.
  • No garbage collector: There are great languages with automatic memory management like Go that we use every day.  But we find languages like C++ and Rust provide predictable and optimizable tail latency.
  • Native Extensibility via Filters: Envoy has layer 4 and layer 7 extension points via filters that are written in C++ and linked into Envoy.
  • Scripting Extensibility via Lua: You can write Lua scripts as extension points as well.  This is very convenient for rapid prototyping and debugging.
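For readers who have not set up mutual TLS by hand, the sketch below shows the two knobs the first bullets describe (a protocol version floor and mandatory client certificates) using Python's standard `ssl` module. This is not Envoy configuration; it stands in for the kind of per-app, per-language setup that a sidecar lets you configure once. The function name is our own, and the certificate paths in the comments are placeholders.

```python
import ssl


def build_mtls_server_context() -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Protocol floor: reject anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # The "mutual" part: the client must present a certificate we can verify.
    ctx.verify_mode = ssl.CERT_REQUIRED
    # A real deployment would also load key material, for example:
    #   ctx.load_cert_chain("server.crt", "server.key")   # placeholder paths
    #   ctx.load_verify_locations("trusted-ca.crt")       # placeholder path
    return ctx
```

Multiply this by every language and framework in a polyglot environment and the appeal of configuring it the same way for every app becomes clear.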

One of these benefits deserves an even deeper dive in a security-oriented discussion.  The API granularity of Envoy is based on a scheme called "xDS" which we think of as follows:  Logically split the Envoy config API based on the user of that API.  The user in this case is almost always some other program (not a human), for instance a Service Mesh control plane element.

For instance, in xDS listeners ("How should I get requests from users?") are separated from clusters ("What pods or servers are available to handle requests to the shoppingcart service?").  The "x" in "xDS" is replaced with whatever functionality is implemented ("LDS" for listener discovery service).  Our favorite security-related partitioning is that the Secret Discovery Service can be used for propagating secrets to the sidecars independent of the other xDS APIs.

Because SDS is separate, the control plane can implement the Principle of Least Privilege: nothing outside of SDS needs to handle or have access to any private key material.
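One way to picture this partitioning is as a routing table from resource kind to discovery service. The sketch below is a toy model and not Envoy's actual API: the `Resource` type and `partition` function are hypothetical, and only the service abbreviations (LDS, CDS, SDS) correspond to real xDS names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    kind: str  # "listener", "cluster", or "secret"
    name: str
    body: str


# Hypothetical mapping from resource kind to the discovery service that serves it.
SERVICE_FOR_KIND = {"listener": "LDS", "cluster": "CDS", "secret": "SDS"}


def partition(resources):
    """Group resources by discovery service so each controller sees only its own kind."""
    by_service = {service: [] for service in SERVICE_FOR_KIND.values()}
    for resource in resources:
        by_service[SERVICE_FOR_KIND[resource.kind]].append(resource)
    return by_service
```

In this model the component feeding LDS or CDS never receives a `secret` resource, so it has nothing sensitive to leak.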

Mutual TLS is a great enhancement to your security posture in a microservices environment.  We see mutual TLS adoption as gradual - almost any real-world app will have some containerized microservices ready to join the service mesh and mTLS on day one.  But practically speaking, many of these will depend on mesh-external services, containerized or not.  It is possible in most cases to integrate these services into the same trust domain as the service mesh, and oftentimes these components can even participate in client TLS authentication so you get true mutual TLS.

In our experience, this happens by gradually expanding the "circle" of things protected with mutual TLS.  First, stateless containerized business logic, next in-cluster third party services, finally external state stores like bare metal databases.  That's why we focus on making the state of mTLS easy to understand in Aspen Mesh, and provide assistants to help you detect configuration mishaps.
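While the circle is expanding, the classic mishap is a client and server that disagree about whether a connection uses mTLS. The following sketch shows the kind of check an assistant might run; it is not how Aspen Mesh implements it. It assumes a deliberately strict model (a server either requires mTLS or accepts only plaintext), and the function name and data shapes are hypothetical.

```python
def find_mtls_mismatches(client_uses_mtls, server_requires_mtls, calls):
    """Return the (client, server) pairs whose mTLS settings disagree.

    client_uses_mtls:     {(client, server): bool}, missing entries mean plaintext
    server_requires_mtls: {server: bool}, missing entries mean plaintext
    calls:                iterable of (client, server) pairs seen in the mesh
    """
    mismatches = []
    for client, server in calls:
        sends = client_uses_mtls.get((client, server), False)
        requires = server_requires_mtls.get(server, False)
        if sends != requires:
            mismatches.append((client, server))
    return mismatches
```

A service that has joined the mesh but still calls an external database in plaintext shows up as consistent (both sides plaintext), while a half-migrated pair is flagged.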

What Lives Outside the Sidecar?

You need a control plane to configure all of these sidecars.  In some simple cases it may be tempting to do this with some CI integration to generate configs plus DNS-based discovery.  This is viable but it's hard to do rapid certificate rotation.  Also, it leaves out more dynamic techniques like canaries, progressive delivery and A/B testing.  For this reason, we think most real-world applications will include an online control plane that should:

  • Disseminate configuration to each of the sidecars with a scalable approach.
  • Rotate sidecar certificates rapidly to reduce the value to an attacker of a one-time exploit of an application.
  • Collect metadata on what is communicating with what.
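As a small illustration of the rotation item above, a control plane needs some rule for deciding when a sidecar certificate is due for replacement. The sketch below rotates once a fixed fraction of the certificate's lifetime has elapsed; the function name and the default of 0.5 are our own assumptions, not a documented Aspen Mesh or Istio setting.

```python
from datetime import datetime, timedelta, timezone


def should_rotate(not_before: datetime, not_after: datetime,
                  now: datetime, fraction: float = 0.5) -> bool:
    """Return True once `fraction` of the certificate's validity window has passed."""
    lifetime = not_after - not_before
    return now >= not_before + lifetime * fraction
```

With short-lived certificates, a key stolen in a one-time exploit stops being useful quickly, which is why rapid rotation is hard to do with a static, CI-generated configuration.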

A good security posture means you should be automating some work on top of the control plane. We think these things are important (and built them into Aspen Mesh):

  • Organizing information to help humans narrow in on problems quickly.
  • Warning on potential misconfigurations.
  • Alerting when unhealthy communication is observed.
  • Inspecting the firehose of metadata for surprises - these patterns could be application bugs or security issues or both.
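The last item can be as simple as a set difference between what was observed and what was expected. This is a minimal sketch under that assumption; the function name and the service names are illustrative only, and a production system would work on a stream of metadata rather than two in-memory lists.

```python
def unexpected_edges(observed, expected):
    """Return observed (source, destination) pairs that nobody declared as expected."""
    return sorted(set(observed) - set(expected))


# Illustrative data: which services are supposed to talk, and what was actually seen.
expected = [("web", "cart"), ("cart", "db")]
observed = [("web", "cart"), ("cart", "db"), ("web", "db"), ("web", "cart")]
surprises = unexpected_edges(observed, expected)  # [("web", "db")]
```

Whether `web` talking directly to `db` is a bug or an intrusion is for a human to decide; the automation's job is to make sure someone notices.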

If you’re considering or going down the Kubernetes path, you should be thinking about the unique security challenges that come with microservices running in a Kubernetes cluster. Kubernetes solves many of these, but there are some critical runtime issues that a service mesh can make easier and more secure. If you would like to talk about how the Aspen Mesh platform and team can address your specific security challenge, feel free to find some time to chat with us.  Or to learn more, get the free white paper on achieving Zero-trust security for containerized applications here.


To Multicluster, or Not to Multicluster: Solving Kubernetes Multicluster Challenges with a Service Mesh

If you are going to be running multiple clusters for dev and organizational reasons, it’s important to understand your requirements and decide whether you want to connect these in a multicluster environment and, if so, to understand various approaches and associated tradeoffs with each option.

Kubernetes has become the container orchestration standard, and many organizations are currently running multiple clusters. But while communication issues within clusters are largely solved, communication across clusters is still a major challenge for most organizations.

Service mesh helps to address multicluster challenges. Start by identifying what you want, then shift to how to get it. We recommend understanding your specific communication use case, identifying your goals, then creating an implementation plan.

Multicluster offers a number of benefits:

  • Single pane of glass
  • Unified trust domain
  • Independent fault domains
  • Intercluster traffic
  • Heterogeneous/non-flat network

These benefits can be achieved with various approaches:

  • Independent clusters
  • Common management
  • Cluster-aware service routing through gateways
  • Flat network
  • Split-horizon Endpoints Discovery Service (EDS)
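The last approach is the least self-explanatory, so here is a toy model of the idea as we understand it: the endpoints a sidecar receives depend on which cluster is asking. Local endpoints are returned directly, while endpoints in other clusters are replaced by that cluster's gateway address. The function name, data shapes and addresses below are hypothetical and are not taken from Istio's implementation.

```python
def endpoints_for(requesting_cluster, endpoints, gateways):
    """Build the endpoint view for one cluster.

    endpoints: list of (cluster, address) pairs for a single service
    gateways:  {cluster: gateway_address} for reaching each cluster from outside
    """
    view = []
    for cluster, address in endpoints:
        if cluster == requesting_cluster:
            # Same cluster: the workload is directly reachable.
            view.append(address)
        elif gateways[cluster] not in view:
            # Other cluster: route via its gateway, listed only once.
            view.append(gateways[cluster])
    return view
```

Because remote workloads are only ever addressed through a gateway, this approach does not require a flat network between clusters.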

If you have decided to multicluster, your next move is deciding the best implementation method and approach for your organization. A service mesh like Istio can help, and when used properly can make multicluster communication painless.

Read the full article here on InfoQ’s site.