Introduction

The year 2019 was not a good one for data security. In the first nine months, there were 5,183 breaches reported with 7.9 billion records exposed [1]. Compared to the same period in 2018, the total number of breaches was up 33.3 percent and the total number of records exposed more than doubled, up 112 percent.

[Figure: records exposed and breaches, 2019 compared with 2018]

What does this tell us? That, despite significant technology investments and advancements, security is still hard. A single phishing email, missed patch, or misconfiguration can let the bad guys in to wreak havoc or steal data. For companies that are moving to the cloud and the cloud-native architecture of microservices and containerized applications, it’s even harder. Now, in addition to the perimeter and the network itself, there’s a new network infrastructure to protect: the myriad connections between microservice containers.  

With microservices, the surface area available for attack grows with every service and every connection between services, putting data at greater risk. Moreover, network-related problems like access control, load balancing, and monitoring that had to be solved once for a monolithic application now must be handled separately for each service within a cluster.

The convergence of the Zero-Trust approach to network security and service mesh technology is emerging to address security in this environment. A service mesh combines security and operations capabilities into a transparent infrastructure layer that sits between the containerized application and the network.

This paper examines the tenets of Zero-Trust security and how a service mesh enables Zero Trust in the microservices environment. It also looks at how Zero-Trust capabilities can help organizations address and demonstrate compliance with stringent industry regulations. The context of our discussion is containerized applications that are managed in Kubernetes clusters. The capabilities we discuss are provided by the Aspen Mesh distribution of the open-source Istio service mesh.

Zero-Trust Security Today

Traditionally, network security has been based on having a strong perimeter to help thwart attackers, commonly known as the castle-and-moat approach. With a secure perimeter constructed of firewalls, you trust the internal network by default and, by extension, anyone who’s already there. Unfortunately, this was never a reliably effective strategy, and it is becoming even less effective in a world where employees expect access to applications and data from anywhere in the world, on any device. Moreover, security professionals generally rank insider threats among the greatest risks to company data, and a perimeter offers little protection against them. This has driven the development of new ways to address these challenges.

In 2010, Forrester Research coined the term “Zero Trust” and overturned the perimeter-based security model with a new principle: “never trust, always verify.” That means no individual or machine is trusted by default from inside or outside the network. Another Zero-Trust precept: “assume you’ve been compromised but may not yet be aware of it.” With the time to identify and contain a breach running at 279 days in 2019 [2], that’s not an unsafe assumption.


Zero Trust Networking Principles

  • Networks should always be considered hostile. Just because you’re inside the “castle” does not make you safe.
  • Network locality is not sufficient for deciding trust in a network. Just because you know someone next to you in the “castle” doesn’t mean you should trust them.
  • Every device, user, and request should be authenticated and authorized. Ensure that everyone and everything entering the “castle” has been properly identified and is allowed to enter.
  • Network policies must be dynamic and calculated from as many sources of data as possible. Ask as many people as possible when validating if someone can enter the “castle.”
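
The first three principles can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular mesh implements them: the `Request` type, the `ALLOWED` table, and the service names are all hypothetical, and the fourth principle (dynamic policy computed from many data sources) is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    source_ip: str           # recorded, but never used to decide trust
    identity: Optional[str]  # verified caller identity, or None if unauthenticated
    action: str
    resource: str


# Hypothetical policy table: only these (identity, action, resource)
# combinations are explicitly allowed. Everything else is denied.
ALLOWED = {
    ("billing-service", "read", "invoices"),
    ("frontend", "read", "catalog"),
}


def is_allowed(request: Request, allowed=ALLOWED) -> bool:
    """Authenticate and authorize every request, regardless of origin."""
    # Network locality is not evidence of trust, so source_ip is ignored.
    if request.identity is None:
        return False  # authentication failed: deny
    # Authorization: deny unless the exact combination is listed.
    return (request.identity, request.action, request.resource) in allowed
```

Note that a request from an internal address is treated exactly like one from outside: only the verified identity and the explicit policy entry matter.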

Starting in 2013, Google began implementing Zero Trust in its networking infrastructure with much success and has made the results of its efforts public as BeyondCorp [3]. Fast forward to 2019, and plans to adopt this new paradigm have spread rapidly across industries, largely in response to massive data breaches and stricter regulatory requirements.

Mitigating Cyberattacks Against Containerized Applications with Zero-Trust Networking 

Here are some examples of attacks that a service mesh can help mitigate: 

  • Service impersonation. A bad actor gains access to the private network for your applications, pretends to be an authorized service, and starts making requests for sensitive data.
  • Unauthorized access. A legitimate service makes requests for sensitive data that it is not authorized to obtain.
  • Packet sniffing. A bad actor gains access to your applications’ private network and captures sensitive data from legitimate requests going over the network.
  • Data exfiltration. A bad actor sends sensitive data out of the protected network to a destination of their choosing.

Security Within the Kubernetes Cluster

While there are myriad Zero-Trust networking solutions available for protecting the perimeter and the operation of corporate networks, there are many new miles of connections within the microservices environment that also need protection. A service mesh provides critical security capabilities such as observability, which helps reduce mean time to detect (MTTD) and mean time to resolve (MTTR), as well as ways to implement and manage encryption, authentication, authorization, policy control, and configuration in Kubernetes clusters.

Simplifying Microservices Security with Incremental mTLS

Kubernetes removes much of the complexity and difficulty involved in managing and operating a microservices application architecture. Kubernetes also sets up basic networking capabilities. However, most of the networking capabilities provided by Kubernetes are constrained to the lower levels of the networking stack.

That means advanced networking functionality, including transport layer security (TLS) encryption, has to be baked into the application. Requiring the application (and your developers) to enable TLS encryption for all inbound and outbound traffic within a Kubernetes environment adds significant complexity. It involves establishing trust, managing certificates, verifying trust, and processing encryption and decryption, none of which is part of the application’s function.

This is the problem that a service mesh leveraging the sidecar approach solves by offloading network functions from the microservice. The sidecar approach puts data path functionality into a separate container and then situates that container as close to the application it is protecting as possible. In Kubernetes, the sidecar container and the application container live in the same Kubernetes pod, so the communication path between sidecar and app is protected inside the pod’s network namespace; by default, it isn’t visible to the host or other network namespaces on the system. 

The sidecar can initiate mutual TLS (mTLS), encrypt service-to-service traffic, and achieve non-repudiation for requests without requiring any changes or support from the applications. This layer of security reduces the likelihood of a successful man-in-the-middle (MITM) attack by requiring every party in a request to present a valid certificate that the other party can verify.
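
To make the “mutual” part concrete, here is a minimal sketch using Python’s standard `ssl` module. It is not how the sidecar proxy is implemented; it only shows the setting that distinguishes mTLS from ordinary TLS, namely that the server side also demands and verifies a client certificate. The function names and file-path parameters are hypothetical.

```python
import ssl


def build_server_context(cert_file=None, key_file=None, ca_file=None):
    """Server-side TLS context that requires a verified client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Ordinary TLS leaves this at CERT_NONE on the server side.
    # CERT_REQUIRED is what makes the handshake mutual.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # our identity
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signs peer certificates
    return ctx


def build_client_context(cert_file=None, key_file=None, ca_file=None):
    """Client-side TLS context that verifies the server and presents its own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

In a service mesh, the sidecar performs the equivalent of both roles on behalf of the application, which is why the application code itself needs no TLS logic.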

Istio provides a control plane with a rich set of tools for configuring mTLS globally (on or off) for the entire cluster or incrementally, enabling mTLS for a subset of services for organizations operating in a hybrid environment.  

Managing Identity, Certificates, and Authorization in Service Mesh

One of the key Zero-Trust security principles requires that “every device, user, and request is authenticated and authorized.” The service mesh addresses this principle by issuing secure identities to services, much as application users are issued identities. In Istio, this identity follows the SPIFFE specification and is carried in a SPIFFE Verifiable Identity Document (SVID); it identifies services across the mesh so they can be authenticated and authorized to perform actions. In addition to handling workload identities, the service mesh creates and renews certificates and mounts the appropriate certificates to the sidecars.
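
As an illustration of what such an identity looks like, Istio encodes workload identities as SPIFFE IDs of the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The sketch below parses that form; the `WorkloadIdentity` type and `parse_spiffe_id` function are hypothetical helpers written for this example, not part of any Istio or SPIFFE library.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Split an Istio-style SPIFFE ID into its three components."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # Expected path layout: /ns/<namespace>/sa/<service-account>
    segments = parts.path.strip("/").split("/")
    if (
        len(segments) != 4
        or segments[0] != "ns"
        or segments[2] != "sa"
        or not segments[1]
        or not segments[3]
    ):
        raise ValueError(f"unexpected SPIFFE ID path: {parts.path!r}")
    return WorkloadIdentity(parts.netloc, segments[1], segments[3])
```

For example, `parse_spiffe_id("spiffe://cluster.local/ns/payments/sa/billing")` yields the trust domain `cluster.local`, the namespace `payments`, and the service account `billing`. Authorization rules can then be written against these components rather than against IP addresses.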

The Istio control plane centralizes policy management and enforces networking rules at runtime. However, this is where things get a little complicated. Correctly configuring mTLS for one service, for example, may require configuring an authentication policy for that service and the corresponding clients.  

The authentication policy follows a complex set of precedence rules which must be accounted for when creating these configuration objects. For example, a namespace-level authentication policy overrides the mesh-level global policy, and a service-level policy overrides the namespace level. Moreover, a service port-level policy overrides the service-specific authentication policy. 
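
The precedence rules above amount to “the most specific policy that is set wins.” The following sketch models that resolution order in Python. It is a simplified model for reasoning about configuration, not Istio’s implementation; the function name is hypothetical, and the mode strings are illustrative labels.

```python
from typing import Optional


def effective_mtls_mode(
    mesh: str,
    namespace: Optional[str] = None,
    service: Optional[str] = None,
    port: Optional[str] = None,
) -> str:
    """Return the mTLS mode that applies, most specific level first.

    Each argument is the mode configured at that level, or None if
    no policy exists at that level.
    """
    # Check from most specific (port) to least specific (namespace).
    for configured in (port, service, namespace):
        if configured is not None:
            return configured
    # Nothing more specific is set, so the mesh-wide policy applies.
    return mesh
```

For example, with a mesh-wide mode of `"STRICT"` and a namespace-level mode of `"PERMISSIVE"`, a service in that namespace with no policy of its own resolves to `"PERMISSIVE"`. This is exactly the kind of override that is easy to introduce by accident.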

Access Control and Enforcing the Level of Least Privilege

In addition to applying a Zero-Trust approach to the network connections within the Kubernetes cluster, the service mesh adds controls over traffic ingress and egress at the perimeter. Allowed user behavior is addressed with role-based access control (RBAC). With these controls, the Zero-Trust philosophy of “trust no one, authenticate everyone” stays in force by providing enforceable least privilege access to services in the mesh.

Ingress Control  

It’s important to note that routing traffic within a service mesh and allowing external traffic into the mesh work differently. By default, and consistent with Kubernetes, Istio allows every service inside the mesh to talk to every other service, so policies within the mesh specify exceptions to that open default.

Getting traffic into the service mesh works in reverse (similar to traditional load balancers and application delivery controllers). That means specifying exactly what traffic is allowed in so that your services can safely connect with APIs and databases both within the organization and on the internet. With traditional load balancers, virtual IPs and virtual servers have long been used as concepts that enable operators to configure ingress traffic in a flexible and scalable manner. 

Similarly, Istio gateways control exposure of services at the edge, enabling monitoring and employing routing rules to address traffic as it enters the mesh. This works much in the same way that tying virtual IPs to virtual servers works with traditional load balancers. Gateways also leverage the built-in capabilities of Kubernetes as an ingress controller to add security with ingress rules such as whitelisting and blacklisting. 
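
The allow-and-deny logic of ingress rules can be sketched with Python’s standard `ipaddress` module. This is an illustration of the decision only, not gateway configuration; the function name and the CIDR ranges in the example are hypothetical.

```python
import ipaddress


def ingress_permitted(client_ip, allow_cidrs, deny_cidrs=()):
    """Decide whether a client address may enter the mesh.

    Deny rules take priority. Anything not explicitly allowed is rejected,
    which matches the "specify exactly what is allowed in" model.
    """
    addr = ipaddress.ip_address(client_ip)
    if any(addr in ipaddress.ip_network(cidr) for cidr in deny_cidrs):
        return False
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_cidrs)
```

For example, `ingress_permitted("10.1.2.3", ["10.0.0.0/8"], ["10.1.2.0/24"])` is rejected because the deny range matches first, while `ingress_permitted("10.9.9.9", ["10.0.0.0/8"])` is admitted.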

Egress Control  

Egress is also a key security concern. It’s important to be cautious about what data is allowed to leave a cluster because most security breaches include some type of egress exploit — typically by data exfiltration. This can be carried out by malware executing a command to extract data and transmit it to an unauthorized IP address or by an unauthorized person who intentionally or unwittingly extracts the data and shares it with an unauthorized third party or moves it to an insecure system. Both types of exploit are hard to detect because the data flow looks like business-as-usual network traffic. 

The service mesh enables control over how traffic is routed from services in the mesh to external services. The native Istio default is to allow the sidecar proxy to pass through all requests to services not configured within the mesh, but Istio also provides egress traffic controls, including whitelisting and blacklisting. For example, individual services can be configured to control access to external services or, alternatively, to bypass the sidecar for a specific range of IP addresses. Again, this can get complicated as the Kubernetes environment and the service mesh grow. Other Istio egress capabilities include providing gateways for traffic control, managing TLS origination, and supporting Kubernetes-native egress services.
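
The difference between the permissive default and a locked-down egress policy can be modeled in a few lines. Istio names these outbound traffic policy modes `ALLOW_ANY` and `REGISTRY_ONLY`; everything else in this sketch, including the function name and the host patterns, is a hypothetical simplification rather than Istio’s matching logic.

```python
from fnmatch import fnmatch


def egress_permitted(host, mode, registered_hosts):
    """Decide whether a workload may call an external host.

    mode: "ALLOW_ANY" passes all outbound traffic through;
          "REGISTRY_ONLY" permits only hosts registered with the mesh.
    registered_hosts: host names or wildcard patterns such as "*.example.com".
    """
    if mode == "ALLOW_ANY":
        return True
    if mode == "REGISTRY_ONLY":
        target = host.lower()
        return any(fnmatch(target, pattern.lower()) for pattern in registered_hosts)
    raise ValueError(f"unknown outbound traffic policy mode: {mode!r}")
```

Under `REGISTRY_ONLY`, a compromised workload that tries to send data to an unregistered destination is blocked, which is the behavior that limits data exfiltration.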

Role-Based Access Control

Because even the most secure system can be easily circumvented by over-privileged users, it’s important to use a proven strategy for access control. In systems security, role-based access control, or role-based security, provides the ability to enforce the principle of least privilege in an organization. As an advanced access control, it restricts network access based on individuals’ roles within an organization. For enhanced security, different access levels are granted to different authorized users within a network based on what they need to do to perform job responsibilities. 

As a key security element for Kubernetes clusters and service meshes, RBAC provides important features such as:  

  • Delivering more consistent access management 
  • Providing enforceable least privilege 
  • Enabling an authentication mechanism for users with different roles 
  • Restricting user or user group operations 
  • Restricting operations performed by processes inside pods 
  • Controlling resource visibility 
  • Maximizing operational efficiency 
  • Reducing HR and administrative work and IT support 

Kubernetes RBAC controls which permissions are granted to each authorized user or user group in a Kubernetes cluster. While Kubernetes RBAC on its own can help you meet compliance needs, the service mesh extends those capabilities to enable more fine-grained access control. However, for this to work as intended, the service mesh must be configured correctly.
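
The core idea of RBAC, that subjects get permissions only through the roles bound to them, can be sketched as follows. The structure loosely mirrors Kubernetes roles and role bindings, but the data, names, and `can` function are hypothetical and greatly simplified.

```python
# Hypothetical roles: each role is a list of (verbs, resources) rules.
ROLES = {
    "pod-reader": [({"get", "list"}, {"pods"})],
    "deployer": [({"get", "list", "create", "update"}, {"deployments"})],
}

# Hypothetical bindings: which roles each subject holds.
BINDINGS = {
    "alice": ["pod-reader"],
    "ci-bot": ["pod-reader", "deployer"],
}


def can(subject, verb, resource, roles=ROLES, bindings=BINDINGS):
    """Return True only if some role bound to the subject grants the action."""
    for role_name in bindings.get(subject, []):
        for verbs, resources in roles.get(role_name, []):
            if verb in verbs and resource in resources:
                return True
    # Least privilege: no matching rule means no access.
    return False
```

Here `can("alice", "list", "pods")` is allowed, while `can("alice", "create", "deployments")` is not, because no role bound to `alice` grants it. An unknown subject has no bindings and is denied everything.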

Aspen Mesh provides two tools that address the complexity of access control and enforcing the level of least privilege in order to help you achieve a Zero-Trust security posture. Istio Vet is designed to prevent misconfigurations in the service mesh by refusing to allow them in the first place. In addition, Istio Vet warns about incorrect or incomplete service mesh configuration and provides resolution guidance for any issues it finds.

Organizations using global Istio configuration resources can take advantage of the Aspen Mesh-developed tool, Traffic Claim Enforcer. Global configuration resources can affect how traffic flows through the service mesh to any specified target, which requires accurate configuration of resource namespaces to make sure traffic can get to intended destinations. Namespace misconfigurations are difficult to troubleshoot. Traffic Claim Enforcer works with Kubernetes RBAC to help avoid invalid configurations and to provide an early failure for configuration problems for easier, faster detection. Traffic Claim Enforcer can be invoked globally or on a namespace-by-namespace basis. 

[Figure: service mesh for RBAC]

Monitoring, Alerting, and Observability

Monitoring and alerting are key components to successfully meeting security requirements and demonstrating industry compliance. A service mesh takes system monitoring a step further by providing observability. Monitoring reports overall system health, while observability focuses on highly granular insights into the behavior of systems, in particular via consistent, in-depth tracing between all services within the mesh.

For example, one of the most important things to know for security and regulatory compliance is which microservices are involved in an end-user transaction. With many teams deploying dozens of microservices independently, it can be difficult to understand the dependencies across services. Distributed tracing, made possible by the service mesh, addresses this by automatically adding tracing headers to transactions and then reporting the spans to a tracing collector.
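
One practical detail is worth illustrating. In Istio, the sidecar proxies generate the trace headers and report spans, but each application still has to copy the incoming trace headers onto its own outbound requests so that the spans can be stitched into a single trace. The header names below are the ones listed in Istio’s distributed tracing documentation; the helper function itself is a hypothetical sketch.

```python
# Trace context headers that Istio's documentation asks applications to forward.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def propagate_trace_headers(incoming_headers):
    """Return the subset of incoming headers to copy onto outbound calls.

    Header names are matched case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}
```

A service would call this once per inbound request and merge the result into the headers of every downstream request it makes while handling that request.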

Just as important is the ability to create criteria-driven alerts. For example, during the development process, the service mesh can create warnings of potential misconfigurations that would affect security as well as connectivity and performance. At runtime, security alerts are issued when unhealthy communication is observed, allowing rapid response and troubleshooting.

Keeping records of how data flows through the Kubernetes cluster is one of the key requirements of compliance as well as a security best practice. Some service meshes can provide a view of service-to-service communications and can retrieve historical configurations detailing what services were communicating, their corresponding security configuration (e.g. whether or not mutual TLS was enabled, certificate thumbprint and validity period, internal IP address, and protocol) and what the entire cluster configuration was at that point in time (minus Kubernetes secrets, of course).
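
To show how such records support criteria-driven alerts, the sketch below flags service-to-service connections that ran without mutual TLS or whose certificate is close to expiry. The record fields follow the attributes listed above; the `ConnectionRecord` type, the `find_violations` function, and the seven-day warning window are assumptions made for this example and do not describe any product’s data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ConnectionRecord:
    source: str
    destination: str
    protocol: str
    mtls_enabled: bool
    cert_not_after: datetime  # end of the certificate's validity period


def find_violations(records, now, expiry_warning=timedelta(days=7)):
    """Return (record, reason) pairs for connections that need attention."""
    findings = []
    for record in records:
        if not record.mtls_enabled:
            findings.append((record, "plaintext traffic"))
        elif record.cert_not_after - now <= expiry_warning:
            findings.append((record, "certificate expiring soon"))
    return findings
```

Run against historical records, the same check doubles as audit evidence: it shows which connections were encrypted and authenticated at a given point in time.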

Achieving Compliance in Highly Regulated Industries

To combat the increase in high-profile cyberattacks, regulations and standards are evolving to include stricter controls. The aim is to enforce security best practices when organizations process, store, and transmit sensitive data. These include the Payment Card Industry Data Security Standard (PCI DSS) for cardholder data, the EU’s General Data Protection Regulation (GDPR) for personally identifiable information (PII), and the Health Insurance Portability and Accountability Act (HIPAA) for electronic protected health information (ePHI).

In this paper, we have covered how employing a service mesh to achieve Zero-Trust security in a Kubernetes environment addresses authentication, authorization and accounting. Transport encryption via mTLS in the service mesh addresses the data integrity requirement. Moreover, the service mesh removes the burden of addressing these security requirements from the development team, allowing them to focus on functions that provide direct value to customers.

The most common technical requirements across regulations and standards are:

  • Authentication – Verify the identity of the actor seeking access to protected data.
  • Authorization – Verify the actor is allowed to access the requested protected data.
  • Accounting – Provide mechanisms for recording and examining activities within the system.
  • Data Integrity – Protect data from being altered or destroyed in an unauthorized manner.

Why Aspen Mesh

Aspen Mesh can help you to achieve a Zero-Trust security posture by applying the concepts and features discussed in this paper. Aspen Mesh is an enterprise- and production-ready service mesh that extends the capabilities of Istio to address enterprise security and compliance needs. It also provides an intuitive hosted user interface and dashboard that make it easier to deploy, monitor, and configure these features. Aspen Mesh includes:

  • Easy mTLS – The dashboard makes it easy for users to identify services and workloads which have mTLS turned on or off and then easily create a configuration to change the mTLS state as needed. This allows services on the mesh to be consumed by clients outside the Kubernetes environment.
  • Enhanced ingress – With Secure Ingress from Aspen Mesh, operators define a secure ingress object and developers define an application object. Aspen Mesh takes care of the rest, creating the objects that Istio expects.
  • Enhanced egress – Aspen Mesh enables observing what egress points are in use, how frequently they are accessed, and how healthy they are. It also surfaces idle egress policies that can be turned off.
  • Enhanced RBAC – Istio Vet helps prevent RBAC misconfigurations in the service mesh while Traffic Claim Enforcer helps avoid invalid traffic configurations. (See the RBAC section above for details.)
  • Secure-by-default configuration – Aspen Mesh implements a secure-by-default posture by setting communication and security switches to “on.” For example, while Istio supports mTLS encryption for all services in a cluster, Aspen Mesh turns it on by default. Likewise, security features like egress control and protocol sniffing are on by default.
  • Advanced policy and configuration options – Aspen Mesh includes a policy framework that simplifies specifying, measuring, and enforcing security policies, along with alerts that identify configuration errors.

Take the Next Step

At Aspen Mesh, we’re here to help you achieve a Zero-Trust security posture for containerized applications at your organization.

Reach out to us to schedule a time to speak with one of our experts, or refer to the resources below to learn more on your own:

Microservice Security and Compliance in Highly Regulated Industries: Threat Modeling

Securing Containerized Applications With Service Mesh

Microservice Security and Compliance in Highly Regulated Industries: Zero Trust Security

References

  1. “Data Breach QuickView Report: 2019 Q3 trends,” Risk Based Security, November 2019. https://www.riskbasedsecurity.com/2019/11/12/number-of-records-exposed-up-112/
  2. “Cost of a Data Breach Report 2019,” Ponemon Institute and IBM Security, 2019. https://www.ibm.com/security/data-breach
  3. “BeyondCorp,” Google Cloud. https://cloud.google.com/beyondcorp/