Service Mesh Security: Addressing Attack Vectors with Istio

As you break apart your monolith into microservices, you'll gain a slew of advantages such as scalability, increased uptime and better fault isolation. A downside of breaking applications apart into smaller services is that the attack surface grows. Additionally, all the communication that used to take place via function calls within the monolith is now exposed to the network. Adding security that addresses this must be a core consideration on your microservices journey.

One of the key benefits of Istio, the open source service mesh that Aspen Mesh is built on, is that it provides unique service mesh security and policy enforcement to microservices. An important thing to note is that while a service mesh adds several important security features, it is not the end-all-be-all for microservices security. It’s important to also consider a strategy around network security (a good read on how the network can help manage microservices), which can detect and neutralize attacks on the service mesh infrastructure itself, to ensure you’re entirely protected against today’s threats.

So let’s look at the attack vectors that Istio addresses, which include traffic control at the edge, traffic encryption within the mesh and layer-7 policy control.

Security at the Edge
Istio adds a layer of security that allows you to monitor and address compromising traffic as it enters the mesh. Istio integrates with Kubernetes as an ingress controller and takes care of load balancing for ingress. This allows you to add a level of security at the perimeter with ingress rules. You can apply monitoring around what is coming into the mesh and use route rules to manage compromising traffic at the edge.

To ensure that only authorized users are allowed in, Istio’s Role-Based Access Control (RBAC) provides flexible, customizable control of access at the namespace level, service level and method level for services in the mesh. The RBAC engine does two distinct things: it watches for changes to RBAC policy and fetches the updated policy when it sees any, and it authorizes requests at runtime by evaluating the request context against the RBAC policies and returning the authorization result.
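To make the runtime half of that concrete, here is a minimal sketch of evaluating a request context against a set of allow rules. It is not Istio's implementation or policy format: the `Rule` and `RequestContext` types, the role names and the glob matching are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical allow rule: which role may call which method on which service."""
    role: str
    namespace: str
    service: str  # glob pattern, e.g. "reviews" or "*"
    method: str   # glob pattern, e.g. "GET" or "*"


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical request context: who is calling, and what they are calling."""
    roles: frozenset
    namespace: str
    service: str
    method: str


def is_authorized(ctx, rules):
    """Deny by default; allow only if at least one rule matches the context."""
    return any(
        rule.role in ctx.roles
        and rule.namespace == ctx.namespace
        and fnmatchcase(ctx.service, rule.service)
        and fnmatchcase(ctx.method, rule.method)
        for rule in rules
    )


rules = [
    Rule(role="viewer", namespace="default", service="*", method="GET"),
    Rule(role="editor", namespace="default", service="reviews", method="*"),
]

viewer_read = RequestContext(frozenset({"viewer"}), "default", "ratings", "GET")
viewer_write = RequestContext(frozenset({"viewer"}), "default", "ratings", "POST")
editor_write = RequestContext(frozenset({"editor"}), "default", "reviews", "POST")

for ctx in (viewer_read, viewer_write, editor_write):
    print(ctx.service, ctx.method, "allowed" if is_authorized(ctx, rules) else "denied")
```

The sketch covers the three levels mentioned above (namespace, service, method) and denies anything no rule explicitly allows.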

Encrypting Traffic
Security at the edge is a good start, but if a malicious actor gets through, Istio provides defense with mutual TLS encryption of the traffic between your services. The mesh can automatically encrypt and decrypt requests and responses, removing that burden from the application developer. It can also improve performance by prioritizing the reuse of existing, persistent connections, reducing the need for the computationally expensive creation of new ones.

Istio provides more than just client-server authentication and authorization; it allows you to understand and enforce how your services are communicating and prove it cryptographically. It automates the delivery of certificates and keys to the services, the proxies use them to encrypt the traffic (providing mutual TLS), and Istio periodically rotates the certificates to reduce exposure to compromise. You can use TLS to ensure that Istio instances can verify that they’re talking to other Istio instances to prevent man-in-the-middle attacks.
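For a sense of what the mesh takes off the developer's plate, here is a rough sketch of the two pieces an application would otherwise handle itself: a server-side TLS context that requires a client certificate, and a check for when a certificate is due for rotation. It uses Python's standard `ssl` module. The function names, the one-hour rotation window and the file paths in the comments are assumptions for illustration, not anything Istio prescribes.

```python
import ssl
from datetime import datetime, timedelta, timezone


def build_mutual_tls_server_context():
    """Server-side context that rejects clients without a verifiable certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # requiring a client cert makes TLS mutual
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # In a real service you would also load this workload's identity and the
    # CA used to verify peers. The paths below are hypothetical:
    #   ctx.load_cert_chain("/etc/certs/cert-chain.pem", "/etc/certs/key.pem")
    #   ctx.load_verify_locations("/etc/certs/root-cert.pem")
    return ctx


def needs_rotation(not_after, now, rotate_before=timedelta(hours=1)):
    """True once we are inside the safety window before the certificate expires."""
    return now >= not_after - rotate_before


ctx = build_mutual_tls_server_context()
expiry = datetime(2030, 1, 1, 12, 0, tzinfo=timezone.utc)
print(ctx.verify_mode == ssl.CERT_REQUIRED)
print(needs_rotation(expiry, now=expiry - timedelta(minutes=30)))
```

With a mesh, the sidecar proxies do the equivalent of both steps for every service, so none of this lives in application code.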

Istio makes TLS easy with Citadel, the Istio Auth controller for key management. It allows you to secure traffic over the wire and also provides strong identity-based authentication and authorization for each microservice.

Policy Control and Enforcement
Istio gives you the ability to enforce policy at the application level with layer-7 control. Applying policy at this level is ideal for service routing, retries, circuit-breaking, and for security that operates at the application layer, such as token validation. Istio provides the ability to set up whitelists and blacklists so you can let in what you know is safe and keep out what you know isn’t.

Istio’s Mixer enables integrating extensions into the system and lets you declare policy constraints on network or service behavior in a standardized expression language. The benefit is that you can funnel all of those things through a common API, which enables you to cache policy decisions at the edge of the service so that, if the downstream policy systems start to fail, the network stays up.
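The caching idea can be sketched in a few lines. This is not Mixer's API: `CachedPolicyClient`, its TTL and the choice to serve the last known decision when the backend is unreachable (and to deny when there is none) are all assumptions chosen to illustrate the behavior described above.

```python
import time


class CachedPolicyClient:
    """Hypothetical policy client that caches decisions next to the service."""

    def __init__(self, backend, ttl_seconds=60.0, clock=time.monotonic):
        self._backend = backend  # callable: (source, destination) -> bool
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}         # (source, destination) -> (decision, fetched_at)

    def allowed(self, source, destination):
        key = (source, destination)
        cached = self._cache.get(key)
        now = self._clock()
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # fresh cache hit, no backend call needed
        try:
            decision = bool(self._backend(source, destination))
        except Exception:
            # Backend is failing: serve the last known decision if we have one,
            # otherwise deny. Both choices are assumptions of this sketch.
            return cached[0] if cached is not None else False
        self._cache[key] = (decision, now)
        return decision


# Usage: a fake backend that works once and then goes down.
calls = {"count": 0}


def flaky_backend(source, destination):
    calls["count"] += 1
    if calls["count"] > 1:
        raise ConnectionError("policy backend unavailable")
    return source == "frontend" and destination == "reviews"


fake_now = [0.0]
client = CachedPolicyClient(flaky_backend, ttl_seconds=60.0, clock=lambda: fake_now[0])

first = client.allowed("frontend", "reviews")   # fetched from the backend
fake_now[0] = 120.0                             # cache entry is now stale
second = client.allowed("frontend", "reviews")  # backend is down, stale entry served
print(first, second)
```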

Istio addresses some key concerns that arise with microservices. You can make sure that only the services that are supposed to talk to each other are talking to each other. You can encrypt those communications to secure against attacks that can occur when those services interact, and you can apply application-wide policy. While there are other, manual ways to accomplish much of this, the beauty of a mesh is that it brings several capabilities together and lets you apply them in a manner that is scalable.

At Aspen Mesh, we’re working on some new capabilities to help you get the most out of the security features in Istio. We’ll be posting something on that in the near future so check back in on the Aspen Mesh blog. Or to learn more, get the free white paper on achieving Zero-trust security for containerized applications here.


Observability, or "Knowing What Your Microservices Are Doing"

Microservicin’ ain’t easy, but it’s necessary. Breaking your monolith down into smaller pieces is a must in a cloud native world, but it doesn’t automatically make everything easier. Some things actually become more difficult. An obvious area where it adds complexity is communications between services; observability into service to service communications can be hard to achieve, but is critical to building an optimized and resilient architecture.

The idea of monitoring has been around for a while, but observability has become increasingly important in a cloud native landscape. Monitoring aims to give an idea of the overall health of a system, while observability aims to provide insights into the behavior of systems. Observability is about data exposure and easy access to information which is critical when you need a way to see when communications fail, do not occur as expected or occur when they shouldn’t. The way services interact with each other at runtime needs to be monitored, managed and controlled. This begins with observability and the ability to understand the behavior of your microservice architecture.

A primary microservices challenge is trying to understand how individual pieces of the overall system are interacting. A single transaction can flow through many independently deployed microservices or pods, and discovering where performance bottlenecks have occurred provides valuable information.

It depends who you ask, but many considering or implementing a service mesh say that the number one feature they are looking for is observability. There are many other features a mesh provides, but those are for another blog. Here, I’m going to cover the top observability features provided by a service mesh.

Tracing

One overwhelmingly important thing to know about your microservices architecture is specifically which microservices are involved in an end-user transaction. If many teams are deploying their dozens of microservices, all independently of one another, it’s difficult to understand the dependencies across your services. A service mesh provides uniformity, which means tracing is programming-language agnostic, addressing inconsistencies in a polyglot world where different teams, each with its own microservice, can be using different programming languages and frameworks.

Distributed tracing is great for debugging and understanding your application’s behavior. The key to making sense of all the tracing data is being able to correlate spans from different microservices which are related to a single client request. To achieve this, all microservices in your application should propagate tracing headers. If you’re using a service mesh like Aspen Mesh, which is built on Istio, the ingress and sidecar proxies automatically add the appropriate tracing headers and report the spans to a tracing collector backend. Istio provides distributed tracing out of the box, making it easy to integrate tracing into your system. Propagating tracing headers in an application can provide nice hierarchical traces that graph the relationship between your microservices. This makes it easy to understand what is happening when your services interact and whether there are any problems.
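The one thing the proxies cannot do for you is carry the headers from an inbound request to the outbound calls your code makes while handling it. A minimal sketch of that step is below. The header names are the B3-style set Istio has documented for this purpose; treat the exact list as an assumption and check it against the version you run. `propagate_trace_headers` is a hypothetical helper name.

```python
# Tracing headers to copy from the inbound request onto outbound requests.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def propagate_trace_headers(incoming_headers):
    """Return only the tracing headers present on the inbound request."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


inbound = {
    "Content-Type": "application/json",
    "X-Request-Id": "req-123",
    "X-B3-TraceId": "463ac35c9f6413ad",
    "X-B3-SpanId": "a2fb4a1d1a96d312",
}
outbound_headers = propagate_trace_headers(inbound)
print(outbound_headers)
```

You would merge `outbound_headers` into the headers of every downstream call made on behalf of that request, whatever HTTP client you use.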

Metrics

A service mesh can gather telemetry data from across the mesh and produce consistent metrics for every hop. Deploying your service traffic through the mesh means you automatically collect metrics that are fine-grained and provide high level application information since they are reported for every service proxy. Telemetry is automatically collected from any service pod providing network and L7 protocol metrics. Service mesh metrics provide a consistent view by generating uniform metrics throughout. You don’t have to worry about reconciling different types of metrics emitted by various runtime agents, or add arbitrary agents to gather metrics for legacy apps. It’s also no longer necessary to rely on the development process to properly instrument the application to generate metrics. The service mesh sees all the traffic, even into and out of legacy “black box” services, and generates metrics for all of it.

Valuable metrics that a service mesh gathers and standardizes include:

  • Success Rates
  • Request Volume
  • Request Duration
  • Request Size
  • Request and Error Counts
  • Latency
  • HTTP Error Codes

These metrics make it simpler to understand what is going on across your architecture and how to optimize performance.
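As a rough illustration of how a few of these are derived from per-request telemetry, here is a small sketch. The `RequestRecord` shape and the `summarize` function are hypothetical, and counting only 5xx responses as failures is an assumption of this sketch rather than a fixed rule.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """Hypothetical per-request telemetry as a proxy might report it."""
    status: int
    duration_ms: float
    size_bytes: int


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Roll individual requests up into volume, success rate, latency and errors."""
    total = len(records)
    if total == 0:
        return {"request_count": 0}
    errors = [r for r in records if r.status >= 500]  # assumption: 5xx = failure
    durations = [r.duration_ms for r in records]
    return {
        "request_count": total,
        "error_count": len(errors),
        "success_rate": (total - len(errors)) / total,
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "bytes_total": sum(r.size_bytes for r in records),
        "errors_by_code": dict(Counter(r.status for r in errors)),
    }


records = [
    RequestRecord(200, 10.0, 512),
    RequestRecord(200, 20.0, 256),
    RequestRecord(503, 30.0, 128),
    RequestRecord(200, 40.0, 1024),
]
summary = summarize(records)
print(summary)
```

Because the mesh produces the same record shape for every hop, the same roll-up works for every service, including ones you cannot instrument yourself.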

Most failures in the microservices space occur during the interactions between services, so a view into those transactions helps teams better manage architectures to avoid failures. Observability provided by a service mesh makes it much easier to see what is happening when your services interact with each other, making it easier to build a more efficient, resilient and secure microservice architecture.


The Road Ahead for Service Mesh

This is the third in a blog series covering how we got to a service mesh, why we decided on the type of mesh we did and where we see the future of the space.

If you’re struggling to manage microservices as architectures continue to become more complex, there’s a good chance you’ve at least heard of service mesh. For the purposes of this blog, I’ll assume you’re familiar with the basic tenets of a service mesh.

We believe that service mesh is advancing microservice communication to a new level that is unachievable with the one-off solutions that were previously being used. Things like DNS provide some capabilities like service discovery, but don’t provide fast retries, load balancing, tracing and health monitoring. The old approach also requires that you cobble together several things each time when it’s possible to bundle it all together in a reusable tool.

While it’s possible to accomplish much of what a service mesh manages with individual tools and processes, it’s manual and time consuming. The images below provide a good idea of how a mesh simplifies the management of microservices.

 

 

Right Around the Corner

So what’s in the immediate future? I think we’ll see the technology mature quickly and add more capabilities as standard features, as enterprises realize the efficiency gains a mesh creates and look to adopt it as the standard for managing microservice architectures. Offerings like Istio are not yet ready for production deployments, but the roadmap is progressing quickly and it seems we’ll reach v1 in short order. Security is just one feature provided by a service mesh, but for most enterprises it’s a major consideration, and I see policy enforcement and monitoring options becoming more robust for enterprise production deployments. A feature I see on the near horizon, and one that will provide tremendous value, is an analytics platform that surfaces insights from the huge amount of telemetry data in a service mesh. I think an emerging value proposition will be that the mesh lets you gather and act on data that helps you manage your entire architecture more efficiently.

Further Down the Road

There is a lot of discussion on what’s on the immediate horizon for service mesh, but what is more interesting is considering what the long term will bring. My guess is that we ultimately come to a mesh being an embedded value add in a platform. Microservices are clearly the way of the future, so organizations are going to demand an effortless way to manage them. They’ll want something automated, running in the background that never has to be thought about. This is probably years down the road, but I do believe service mesh will eventually be a ubiquitous technology that is a fully managed plug and play config. It will be interesting to see new ways of using the technology to manage infrastructure, services and applications.

We’re excited to be part of the journey, and are inspired by the ideas in the Istio community and how users are leveraging service mesh to solve direct problems created by the explosion of microservices and also find new efficiencies with it. Our goal is to make the implementation of a mesh seamless with your existing technology and provide enhanced features, knowledge and support to take the burden out of managing microservices. We’re looking forward to the road ahead and would love to work with you to make your microservices journey easier.


Top 3 Reasons to Manage Microservices with Service Mesh


Building microservices is easy; operating a microservice architecture is hard. Many companies are successfully using tools like Kubernetes for deploys, but they still face runtime challenges. This is where the service mesh comes in. It greatly simplifies the management of containerized applications and makes it easier to monitor and secure microservice-based applications. So what are the top 3 reasons to use a supported service mesh? Here’s my take.

Security

Since service mesh operates on a data plane, it’s possible to apply common security across the mesh which provides much greater security than multilayer environments like Kubernetes. A service mesh secures inter-service communications so you can know what a service is talking to and if that communication can be trusted.

Observability

Most failures in the microservices space occur during the interactions between services, so a view into those transactions helps teams better manage architectures to avoid failures. A service mesh provides a view into what is happening when your services interact with each other. The mesh also greatly improves tracing capabilities and provides the ability to add tracing without touching all of your applications.

Simplicity

A service mesh is not a new technology, but rather a bundling together of several existing technologies in a package that makes managing the infrastructure layer much simpler. There are existing solutions that cover some of what a mesh does. Take DNS, for example. It’s a good way to do service discovery when you don’t care about the source trying to discover the service. If all you need in service discovery is to find the service and connect to it, DNS is sufficient, but it doesn’t give you fast retries or health monitoring. When you want to ask more advanced questions, you need a service mesh. You can cobble things together to address much of what a service mesh addresses, but why would you want to if you could just interact with a service mesh that provides a one-time, reusable packaging?
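To give a feel for the "cobbling together", here is a sketch of just two of the pieces DNS does not give you: retrying a failed call on another endpoint, and remembering which endpoints looked unhealthy. The `RetryingBalancer` class is hypothetical and deliberately naive (it never marks an endpoint healthy again, for instance); a mesh proxy does considerably more than this.

```python
import itertools


class RetryingBalancer:
    """Hypothetical client-side round-robin balancer with retries."""

    def __init__(self, endpoints, max_attempts=3):
        self._endpoints = list(endpoints)
        self._healthy = set(self._endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._max_attempts = max_attempts

    def _next_healthy(self):
        if not self._healthy:
            return None
        # At least one endpoint is healthy, so this loop always returns.
        for endpoint in self._cycle:
            if endpoint in self._healthy:
                return endpoint

    def call(self, send):
        """Try send(endpoint) on successive healthy endpoints until one answers."""
        last_error = None
        for _ in range(self._max_attempts):
            endpoint = self._next_healthy()
            if endpoint is None:
                break
            try:
                return send(endpoint)
            except ConnectionError as exc:
                self._healthy.discard(endpoint)  # crude passive health check
                last_error = exc
        raise ConnectionError("no healthy endpoint answered") from last_error


def send(endpoint):
    # Stand-in for a real network call; endpoint "a" is pretending to be down.
    if endpoint == "a":
        raise ConnectionError("a is unreachable")
    return "ok from " + endpoint


balancer = RetryingBalancer(["a", "b", "c"])
first_reply = balancer.call(send)
second_reply = balancer.call(send)
print(first_reply, second_reply)
```

Every service, in every language your teams use, would need its own version of this logic. That repetition is what the reusable packaging of a mesh removes.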

There are certainly many more advantages to managing microservices with a service mesh, but I think the above 3 are major selling points where organizations that are looking to scale their microservice architecture would find the greatest benefit. No doubt there will also be expanded capabilities in the future, such as analytics dashboards that provide easy-to-consume insights from the huge amount of data in a service mesh. I’d love to hear other ideas you might have on top reasons to use service mesh; hit me up @zjory.