How Delphi Simplifies Kubernetes Security with Aspen Mesh

Customer Story: How Delphi Simplifies Kubernetes Security with Aspen Mesh

Delphi and Zero-Trust Security

Delphi delivers software solutions that help professional liability insurers streamline their operations and optimize their business processes. Because Delphi operates in the highly regulated healthcare industry, privacy and compliance requirements such as HIPAA and APRA mandate a highly secure environment. As such, a Zero-Trust environment is of utmost importance for Delphi and its customers.

The infrastructure team at Delphi has fully embraced a cloud-native stack to deliver the Delphi Digital Platform to its customers. The team leverages Kubernetes to effectively manage builds and deploys. Delphi planned to use Kubernetes from the start, but was looking for a simpler security solution for their infrastructure that could be managed without implementations in each service. 

While Delphi was getting tremendous value from Kubernetes, they needed to find an easier way to bake security into the infrastructure. Taking advantage of a service mesh was the obvious solution to address this challenge, as it provides cluster-wide mTLS encryption. 

The team chose Istio to confront this problem. The initial solution set up a certificate at the load balancer, but this left plain HTTP traffic between the load balancer and the service. Unfortunately, this was not acceptable in the highly regulated healthcare industry, with its strict requirements to keep personal data secure.

Achieving Security with a Service Mesh

To solve these challenges, Delphi engaged with Aspen Mesh to implement an end-to-end encrypted solution, from the client to back-end SaaS applications. This was achieved by enabling mTLS mesh-wide from service to service and creating custom Istio policy manifests to integrate cert-manager and Let's Encrypt for client-side encryption. As a result, Delphi is able to provide secure ingress integration for a multitenant B2C environment and deploy a fully scalable solution.


This Aspen Mesh solution lets Delphi use Let’s Encrypt seamlessly with Istio, removing the need to consider building security into application development and placing it into an infrastructure solution that is highly scalable. Leveraging the power of Kubernetes, Istio and Aspen Mesh, the Delphi team is delivering a highly secure platform to their customers without the need to implement encryption in each service. 

“At this point, I look at Aspen Mesh as an extension of my team” 

- Bill Reeder, Delphi Technology Lead Architect


How to Approach Zero-Trust Security with a Service Mesh


Last year was challenging for data security. In the first nine months alone, there were 5,183 breaches reported with 7.9 billion records exposed. Compared to mid-year 2018, the total number of breaches was up 33.3 percent and the total number of records exposed more than doubled, up 112 percent.


What does this tell us? That, despite significant technology investments and advancements, security is still hard. A single phishing email, missed patch, or misconfiguration can let the bad guys in to wreak havoc or steal data. For companies moving to the cloud and the cloud-native architecture of microservices and containerized applications, it’s even harder. Now, in addition to the perimeter and the network itself, there’s a new network infrastructure to protect: the myriad connections between microservice containers.

With microservices, the surface area available for attack has increased exponentially, putting data at greater risk. Moreover, network-related problems like access control, load balancing, and monitoring that had to be solved once for a monolithic application now must be handled separately for each service within a cluster.

Zero-Trust Security and Service Mesh

Security is the most critical part of your application to implement correctly. A service mesh allows you to handle security in a more efficient way by combining security and operations capabilities into a transparent infrastructure layer that sits between the containerized application and the network. Emerging today to address security in this environment is the convergence of the Zero-Trust approach to network security and service mesh technology.

Here are some examples of attacks that a service mesh can help mitigate:

  • Service impersonation
    • A bad actor gains access to the private network for your applications, pretends to be an authorized service, and starts making requests for sensitive data.
  • Unauthorized access
    • A legitimate service makes requests for sensitive data that it is not authorized to obtain.
  • Packet sniffing
    • A bad actor gains access to your application's private network and captures sensitive data from legitimate requests going over the network.
  • Data exfiltration
    • A bad actor sends sensitive data out of the protected network to a destination of their choosing.

So what are the tenets of Zero-Trust security, and how can a service mesh enable Zero Trust in the microservices environment? And how can Zero-Trust capabilities help organizations address and demonstrate compliance with stringent industry regulations?

Threats and Securing Microservices

Traditionally, network security has been based on having a strong perimeter to help thwart attackers, commonly known as the moat-and-castle approach. With a secure perimeter constructed of firewalls, you trust the internal network by default, and by extension, anyone who’s there already. Unfortunately, this was never a reliably effective strategy. More importantly, this approach is becoming even less effective in a world where employees expect access to applications and data from anywhere in the world, on any device. In fact, other types of threats, such as insider threats, are generally considered by most security professionals to be among the greatest threats to data protected by companies, which has led to more development around new ways to address these challenges.

In 2010, Forrester Research coined the term “Zero Trust” and overturned the perimeter-based security model with a new principle: “never trust, always verify.” That means no individual or machine is trusted by default from inside or outside the network. Another Zero-Trust precept: “assume you’ve been compromised but may not yet be aware of it.” With the time to identify and contain a breach running at 279 days in 2019, that’s not an unsafe assumption.

Starting in 2013, Google began its transition to implementing Zero Trust into its networking infrastructure with much success and has made the results of their efforts open to the public in BeyondCorp. Fast forward to 2019 and the plans to adopt this new paradigm have spread across industries like wildfire, largely in response to massive data breaches and stricter regulatory requirements.

While there are myriad Zero-Trust networking solutions available for protecting the perimeter and the operation of corporate networks, there are many new miles of connections within the microservices environment that also need protection. A service mesh provides critical security capabilities such as observability to aid in optimizing MTTD and MTTR, as well as ways to implement and manage encryption, authentication, authorization, policy control and configuration in Kubernetes clusters.

Security Within the Kubernetes Cluster


Here are a few ways to approach enhancing your security with a service mesh:

  • Simplify microservices security with incremental mTLS
  • Manage identity, certificates and authorization
  • Access control and enforcing the level of least privilege
  • Monitoring, alerting and observability

A service mesh also adds controls over traffic ingress and egress at the perimeter. Allowed user behavior is addressed with role-based access control (RBAC). With these controls, the Zero-Trust philosophy of “trust no one, authenticate everyone” stays in force by providing enforceable least privilege access to services in the mesh.

Aspen Mesh can help you to achieve a Zero-Trust security posture by applying these concepts and features. As an enterprise- and production-ready service mesh that extends the capabilities of Istio to address enterprise security and compliance needs, we also provide an intuitive hosted user interface and dashboard that make it easier to deploy, monitor, and configure these features.






Secure Ingress + TLS Termination: The Match You Didn't Know You Needed

Keeping Data Secure

When you work in an increasingly connected world, maintaining the security of users and their data is a challenging problem to solve. As we move into a state where service meshes are becoming a better way to provide availability of services to your users, it’s also more important to ensure that the data is secured from the moment it leaves the user’s device until it is ingressed into your service mesh. In an interconnected world increasingly focused on the security of both users and their data, secure ingress is a fundamentally complex but necessary part of the service mesh architecture.

Today, I’d like to talk about secure ingress, TLS termination and how this all affects users and maintainers of an Istio service mesh. We’ll also touch on some of the pitfalls of setting up TLS termination as well as how Aspen Mesh attempts to simplify the entire process, so your teams are free to focus on providing value to your customers.

 

Security and You: What is Secure Ingress?

So you’ve got your services up and running, sidecars injected, Prometheus churning out stats and Istio working properly in your cluster; you’re ready to start processing user data and adding value. But your security team is leery about that TCP ingress you’ve been using for testing. Gateways and Virtual Services are great, and while passing TLS requirements down to the service has been working, they want a stronger assertion. How can you make sure that all data, whether or not it is secured within the service mesh, is encrypted with no setup from the services themselves?

Enter secure ingress. By configuring TLS requirements on your Istio Gateway, you can make sure that all information is encrypted, even without TLS on your services. Istio supports TLS certificates both in traditional file mount setups and in Kubernetes secrets.
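To make the Gateway-level TLS requirement a little more concrete, here is a minimal sketch that builds an Istio Gateway manifest as a plain Python dictionary. It assumes the Kubernetes-secret setup, where `credentialName` names a TLS secret that the ingress gateway can read. The function name `secure_gateway`, the host `shop.example.com`, and the secret name `shop-example-com-tls` are hypothetical and only there for illustration.

```python
import json


def secure_gateway(name, host, tls_secret):
    """Build an Istio Gateway that terminates TLS at the ingress gateway.

    Assumption: the default ingress gateway pods carry the label
    istio=ingressgateway, and tls_secret is a Kubernetes TLS secret
    readable by that gateway.
    """
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "Gateway",
        "metadata": {"name": name},
        "spec": {
            "selector": {"istio": "ingressgateway"},
            "servers": [
                {
                    "port": {"number": 443, "name": "https", "protocol": "HTTPS"},
                    # SIMPLE = standard one-way TLS, terminated at the gateway.
                    "tls": {"mode": "SIMPLE", "credentialName": tls_secret},
                    "hosts": [host],
                }
            ],
        },
    }


# Hypothetical names, for illustration only.
gateway = secure_gateway("shop-gateway", "shop.example.com", "shop-example-com-tls")
print(json.dumps(gateway, indent=2))
```

Because JSON is accepted wherever Kubernetes expects a manifest, the printed output could be saved to a file and reviewed before anything is applied to a cluster.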

 

Configuration Issues (Otherwise Known As: ANOTHER 503 UF UR?!)

You’ve decided that TLS termination is the way to go. Sounds amazing! No more needing to worry about services handling TLS requirements, and we can even JWT-secure our endpoints! You deploy the gateway, virtual service and authorization policy, and once you’ve got your TLS certs deployed, you hit that endpoint and get… a 503 status? Looking at the Istio ingress gateway logs only tells you that there was an upstream connection failure (UF) and an upstream connection reset (UR). What’s going on?

Welcome to layer seven TCP routing and mTLS requirements. By deploying an authorization policy to JWT-secure your ingress endpoints, you may have inadvertently disabled mTLS, causing your sidecars to balk at communicating. Maybe you haven’t even been fully using sidecars for some of the services up until this point! Each of these errors shows up as an opaque 503 status code and requires digging into the output of istioctl proxy-config just to get an understanding of what’s going on in the backend.

Even more frustrating, the configuration of your service may have been fine without TLS termination. Ingressing through a TCP port to a TCP port on your service for HTTP traffic is fine. But now that you’re using HTTPS, Istio wants to know about what type of traffic you’ll actually be sending. You’ll have to prefix your port names on your service with “http-” in order to tell Istio what you actually are sending over to that service. But Istio’s errors aren’t going to tell you about that.
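Since Istio’s errors will not point at a badly named port, a small pre-deployment check can help. The sketch below scans a Kubernetes Service manifest (as a Python dictionary) and reports ports whose names do not start with a protocol prefix such as “http-”. The protocol set is an illustrative subset, not an exhaustive list, and the helper names and the `orders` service are hypothetical.

```python
# Illustrative subset of protocol prefixes; not an exhaustive list.
KNOWN_PROTOCOLS = {"http", "http2", "https", "grpc", "tcp", "tls"}


def declared_protocol(port_name):
    """Return the protocol a port name declares, or None if it declares none.

    A name declares a protocol when it is either exactly the protocol
    ("http") or the protocol followed by a dash and a suffix ("http-web").
    """
    prefix = port_name.split("-", 1)[0].lower()
    return prefix if prefix in KNOWN_PROTOCOLS else None


def undeclared_ports(service):
    """List the names of Service ports that do not declare a protocol."""
    ports = service.get("spec", {}).get("ports", [])
    return [
        port.get("name", "")
        for port in ports
        if declared_protocol(port.get("name", "")) is None
    ]


# Hypothetical Service manifest, for illustration only.
service = {
    "kind": "Service",
    "metadata": {"name": "orders"},
    "spec": {
        "ports": [
            {"name": "http-api", "port": 8080},
            {"name": "web", "port": 8081},
        ]
    },
}

print(undeclared_ports(service))
```

In this example the port named `web` is reported, because its name gives Istio no hint about the traffic it carries.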

 

Certificates? DNS?

Let’s say you’ve addressed the issues above. You’ve finally gotten your services to connect from the outside world, your cluster is up and running, and everything finally seems to be working as you expected. But hold on, your ELB just restarted, and now it has a new domain name and a new IP address. Suddenly, traffic is no longer ingressing. Your old DNS records aren’t pointing to the correct host name, and since you’re using TLS now, you can’t just point your customers at your new host name and call it good. 

External DNS management will need to be configured for your cluster to make sure this does not happen, updating the records as your host names and IP addresses shift. And when your certificates come up for renewal, will you be ready with cert-manager set up properly in your cluster?
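As a rough sketch of the certificate side of this, the snippet below builds a cert-manager Certificate manifest as a Python dictionary. It assumes a cert-manager release that serves the `cert-manager.io/v1` API and an already configured ClusterIssuer; the issuer name `letsencrypt-prod`, the host name, and the secret name are hypothetical. The namespace is set to `istio-system` on the assumption that the ingress gateway runs there and needs to read the resulting secret.

```python
import json


def ingress_certificate(name, dns_name, secret_name, issuer):
    """Build a cert-manager Certificate for a host served by the ingress gateway.

    Assumptions: cert-manager.io/v1 is available, `issuer` names an
    existing ClusterIssuer, and the ingress gateway runs in istio-system.
    """
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": "istio-system"},
        "spec": {
            # cert-manager writes the issued key pair into this secret
            # and renews it before the certificate expires.
            "secretName": secret_name,
            "dnsNames": [dns_name],
            "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
        },
    }


# Hypothetical names, for illustration only.
certificate = ingress_certificate(
    "shop-cert", "shop.example.com", "shop-example-com-tls", "letsencrypt-prod"
)
print(json.dumps(certificate, indent=2))
```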

 

Secure Ingress with Aspen Mesh

As described above, setting up secure ingress into an Istio cluster is not as simple as it looks. With issues ranging from mTLS configuration and service port naming to ever-changing environments with DNS and certificate renewals, managing a TLS ingress into your cluster can be a daunting process, especially for new Istio users. The good news? Aspen Mesh can help simplify your secure ingress needs. By simply specifying applications and ports that you’d like to open to the world, and the ports you’d like to ingress on, Aspen Mesh’s Secure Ingress will take care of the configuration for you by:

  • Setting up and maintaining gateways, virtual services, and authorization policies
  • Providing you with detailed information about possible misconfigurations in your secure ingress

Now doesn’t that seem like a lot of work you’d rather have your system do for you? Reach out to our team of experts if you’d like to learn more about how we can help.


Microservice Security and Compliance in Highly Regulated Industries: Threat Modeling

The year is 2019, and the number of reported data breaches is up 54% compared to midyear 2018 and is set to be the “worst year on record,” according to Risk Based Security research. Nearly 31 million records have been exposed in the 13 most significant data breaches of the first half of this year. Exposed documents included personal health information (PHI), personally identifiable information (PII) and financial data. Most of these data breaches were caused by one common flaw: poor technical and human controls that could have easily been mitigated if an essential security process had been followed. This simple and essential security process is known as threat modeling.

What is threat modeling?

Threat modeling is the process of identifying and communicating potential risks and threats, then creating countermeasures to respond to those threats. Threat modeling can be applied to multiple areas such as software, systems, networks, and business processes. When threat modeling, you must ask and answer questions about the systems you are working to protect. 

Per OWASP, threat model methodologies answer one or more of the following questions: 

  • What are we building?
    • Outputs:
      • Architecture diagrams
      • Dataflow transitions
      • Data classifications
  • What can go wrong?
    • To best answer this question, organizations typically brainstorm or use structures such as STRIDE, CAPEC or Kill Chains to help determine primary threats that apply to your systems and organization. 
    • Outputs:
      • A list of the main threats that apply to your system.
  • What are we going to do about that?
    • Outputs:
      • Actionable tasks to address your findings.
  • Did we do an acceptable job?
    • Review the quality, feasibility, process, and planning of the work you have done.

These questions require that you step out of your day-to-day responsibilities and holistically consider systems and processes surrounding them. When done right, threat modeling provides a clear view of the project requirements and helps justify security efforts in language everyone in the organization can understand.

Who should be included in threat modeling?

The short answer is, everyone. Threat modeling should not be conducted in a silo by just the security team but should be worked on by a diverse group made up of representatives across the organization. Representatives should include application owners, administrators, architects, developers, product team members, security engineers, data engineers, and even users. Everyone should come together to ask questions, flag concerns and discuss solutions.

A security checklist is essential

In addition to asking and answering general system and process questions, a security checklist should be used for facilitating these discussions. Without a defined and agreed-upon list, your team may overlook critical security controls and won’t be able to evaluate and continually improve standards.

Here’s a simple example of a security checklist:

Authentication and Authorization

☐ Are actors required to authenticate so that there is a guarantee of non-repudiation?

☐ Do all operations in the system require authorization?

Access Control

☐ Is access granted in a role-based fashion?

☐ Are all access decisions relevant at the time the request is performed?

Trust Boundaries

☐ Can you clearly identify where the levels of trust change in your model?

☐ Can you map those to authentication, authorization and access control?

Accounting and Auditing

☐ Are all operations being logged?

☐ Can you guarantee there is no PII, ePHI or secrets being logged?

☐ Are all audit logs adequately tagged?  

When should I start threat modeling? 

“The sooner the better, but never too late.” - OWASP

How often should threat modeling occur?

Threat modeling should occur during system design, and anytime systems or processes change. Ideally, threat modeling is tightly integrated into your development methodology and is performed for all new features and modifications prior to those changes being implemented. By tightly integrating with your development process, you can catch and address issues early in the development lifecycle before they’re expensive and time-consuming to resolve.

Threat modeling: critical for a secure and compliant microservice environment

Securing distributed microservice systems is difficult. The attack surface is substantially larger than that of an equivalent single-system architecture, and it is often much more difficult to fully comprehend all of the ways data flows through the system. Given that microservices can be short-lived and replaced at a moment's notice, the complexity can quickly compound. This is why it is critical that threat modeling is tightly integrated into your development process as early as possible.

Aspen Mesh makes it easier to implement security controls determined during threat modeling

Threat modeling is only one step in a series of steps required to secure your systems. Thankfully, Aspen Mesh makes it trivial to implement security and compliance controls with little to no custom development required, thus allowing you to achieve your security and compliance goals with ease. If you would like to discuss the most effective way for your organization to secure their microservice environments, grab some time to talk through your use case and how Aspen Mesh can help solve your security concerns.



Microservice Security and Compliance in Highly Regulated Industries: Zero Trust Security

Zero Trust Security

Security is the most critical part of your application to implement correctly. Failing to secure your users’ data can be very expensive and can make customers lose their faith in your ability to protect their sensitive data. A recent IBM-sponsored study showed that the average cost of a data breach is $3.92 million, with healthcare being the most expensive industry with an average of $6.45 million per breach. What else might be surprising is that the average time to identify and contain a breach is 279 days, while the average lifecycle of a malicious attack from breach to containment is 314 days. 

Traditionally, network security has been based on having a strong perimeter to help thwart attackers, commonly known as the moat-and-castle approach. This approach is no longer effective in a world where employees expect access to applications and data from anywhere in the world, on any device. This shift is forcing organizations to evolve at a rapid rate to stay competitive in the market, and has left many engineering teams scrambling to keep up with employee expectations. Often this means rearchitecting systems and services to meet these expectations, which is difficult, time consuming, and error prone.

In 2010, Forrester coined the term ‘Zero Trust’, flipping the current security models on their heads by changing how we think about cyberthreats. The new model is to assume you’ve been compromised, but may not yet be aware of it. A couple of years later, Google announced they had implemented Zero Trust into their networking infrastructure with much success. Fast forward to 2019 and the plans to adopt this new paradigm have spread across industries like wildfire, mostly due to the massive data breaches and stricter regulatory requirements.

Here are the key Zero Trust Networking Principles:

  • Networks should always be considered hostile. 
    • Just because you’re inside the “castle” does not make you safe.
  • Network locality is not sufficient for deciding trust in a network.
    • Just because you know someone next to you in the “castle”, doesn’t mean you should trust them.
  • Every device, user, and request is authenticated and authorized.
    • Ensure that every person entering the “castle” has been properly identified and is allowed to enter.
  • Network policies must be dynamic and calculated from as many sources of data as possible. 
    • Ask as many people as possible when validating if someone is allowed to enter the “castle”.
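The first three principles can be sketched as a tiny access decision. This is only an illustration under stated assumptions: the `Request` type, the `ALLOWED` table, and the service names are hypothetical, and a real mesh evaluates far richer policy. The point of the sketch is that the source address is never consulted, so being inside the “castle” grants nothing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    source_ip: str            # network location; deliberately ignored below
    identity: Optional[str]   # verified identity, or None if authentication failed
    action: str


# Hypothetical policy: which verified identities may perform which actions.
ALLOWED = {
    ("billing-service", "read:invoices"),
}


def is_allowed(request):
    """Authenticate and authorize every request, regardless of where it came from."""
    if request.identity is None:
        return False  # unauthenticated: denied even on the internal network
    return (request.identity, request.action) in ALLOWED


inside_but_anonymous = Request("10.0.0.7", None, "read:invoices")
outside_but_verified = Request("203.0.113.9", "billing-service", "read:invoices")
print(is_allowed(inside_but_anonymous), is_allowed(outside_but_verified))
```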

Transitioning to Zero Trust Networking can dramatically increase your security posture, but until recent years, it has been a time consuming and difficult task that required extensive security knowledge within engineering teams, sophisticated internal tooling that could manage workload certificates, and service level authentication and authorization. Thankfully, service mesh technologies such as Istio allow us to easily implement Zero Trust Networking across our microservices and clusters with little effort and minimal service disruption, and without requiring your team to be security experts.

Zero Trust Networking With Istio

Istio provides the following features that help us implement Zero Trust Networking in our infrastructure:

  • Service Identities
  • Mutual Transport Layer Security (mTLS)
  • Role Based Access Control (RBAC) 
  • Network Policy

Service Identities

One of the key Zero Trust Networking principles requires that “every device, user, and request is authenticated and authorized”. Istio implements this key foundational principle by issuing secure identities to services, much like how application users are issued an identity. This is often referred to as the SVID (SPIFFE Verifiable Identity Document) and is used to identify the services across the mesh, so they can be authenticated and authorized to perform actions. Service identities can take different forms based on the platform Istio is deployed on, for example:

  • When deployed on:
    • Kubernetes: Istio can use Kubernetes service accounts.
    • Amazon Web Services (AWS): Istio can use AWS IAM user and role accounts.
    • Google Kubernetes Engine (GKE): Istio can use Google Cloud Platform (GCP) service accounts.
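On Kubernetes, the service account ends up encoded in a SPIFFE-style URI of the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The sketch below builds and parses such identifiers; the helper names are hypothetical, and the trust domain `cluster.local` and the `payments` names are assumptions used only for the example.

```python
def spiffe_id(trust_domain, namespace, service_account):
    """Build the SPIFFE-style identity for a Kubernetes service account."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_spiffe_id(uri):
    """Split an identity URI into (trust_domain, namespace, service_account)."""
    prefix = "spiffe://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    trust_domain, _, path = uri[len(prefix):].partition("/")
    parts = path.split("/")
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError(f"unexpected identity path: {path!r}")
    return trust_domain, parts[1], parts[3]


# Assumed example values, for illustration only.
identity = spiffe_id("cluster.local", "payments", "payments-api")
print(identity)
print(parse_spiffe_id(identity))
```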

Mutual Transport Layer Security (mTLS)

To support secure Service Identities and to secure data in transit, Istio provides mTLS for encrypting service-to-service communication and achieving non-repudiation for requests. This layer of security reduces the likelihood of a successful Man-in-The-Middle attack (MiTM) by requiring all parties in a request to have valid certificates that trust each other. The process for certificate generation, distribution, and rotation is automatically handled by a secure Istio service called Citadel.  
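To show what “all parties must have valid certificates” means at the socket level, here is a minimal sketch using Python’s standard `ssl` module. It is not how Istio’s sidecars are implemented; it only illustrates the server half of mutual TLS, where the server refuses clients that do not present a certificate signed by a trusted authority. The function name and the file paths in the comment are hypothetical.

```python
import ssl


def mutual_tls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Create a server-side TLS context that also demands a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # The "mutual" part: a handshake fails unless the client presents
    # a certificate that chains to a trusted authority.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        context.load_cert_chain(cert_file, key_file)  # this server's own identity
    if ca_file:
        context.load_verify_locations(ca_file)        # authorities trusted to sign clients
    return context


# With real files this might be called as (hypothetical paths):
#   mutual_tls_server_context("server.pem", "server.key", "mesh-ca.pem")
context = mutual_tls_server_context()
print(context.verify_mode)
```

In a mesh, the sidecar performs the equivalent of this setup on behalf of the application, and the certificate files are generated and rotated for you.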

Role Based Access Control (RBAC)

Authorization is a critical part of any secure system and is required for a successful Zero Trust Networking implementation. Istio provides flexible and highly performant RBAC via centralized policy management, so you can easily define what services are allowed to communicate and what endpoints services and users are allowed to communicate with. This makes the implementation of the principle of least privilege (PoLP) simple and reduces the development teams’ burden of creating and maintaining authorization specific code.
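As an illustration of defining which services may call which endpoints, the sketch below builds an Istio AuthorizationPolicy manifest as a Python dictionary. It assumes an Istio release that serves the `security.istio.io/v1beta1` API, and that workloads are selected by an `app` label. The function name, the service names, and the path are hypothetical.

```python
import json


def allow_policy(name, namespace, app, principals, methods, paths):
    """Build an ALLOW policy: only the listed callers may use the listed operations."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumption: target workloads carry an app=<app> label.
            "selector": {"matchLabels": {"app": app}},
            "action": "ALLOW",
            "rules": [
                {
                    "from": [{"source": {"principals": principals}}],
                    "to": [{"operation": {"methods": methods, "paths": paths}}],
                }
            ],
        },
    }


# Hypothetical example: only the frontend service account may read invoices.
policy = allow_policy(
    name="invoices-read",
    namespace="billing",
    app="invoices",
    principals=["cluster.local/ns/web/sa/frontend"],
    methods=["GET"],
    paths=["/invoices/*"],
)
print(json.dumps(policy, indent=2))
```

Once an ALLOW policy selects a workload, requests that match none of its rules are denied, which is what makes least privilege the default rather than an afterthought.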

Network Policy

With Istio’s centralized policy management, you can enforce networking rules at runtime. Common examples include, but are not limited to the following:

  • Allowlisting and denylisting access to services, so that access is only granted to certain actors.
  • Rate limiting traffic, to ensure a bad actor does not cause a Denial of Service attack.
  • Redirecting requests, to enforce that certain actors go through proper channels when making their requests.
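Rate limiting is the easiest of these to picture in code. The following token bucket is a generic sketch of the idea, not Istio’s implementation: each caller gets a bucket that refills at a fixed rate, and a request is rejected when the bucket is empty. The class name and the numbers are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self._now = now          # injectable clock, which keeps the class testable
        self._last = now()

    def allow(self):
        """Return True and spend a token if one is available, else False."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Illustrative numbers: bursts of 5 requests, refilled at 2 requests per second.
bucket = TokenBucket(rate=2, capacity=5)
print([bucket.allow() for _ in range(7)])
```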

Cyber Attacks Mitigated by Zero Trust Networking With Istio

The following are example attacks that can be mitigated:

  1. Service Impersonation - A bad actor is able to gain access to the private network for your applications, pretends to be an authorized service, and starts making requests for sensitive data.
  2. Unauthorized Access - A legitimate service makes requests for sensitive data that it is not authorized to obtain. 
  3. Packet Sniffing - A bad actor gains access to your application's private network and captures sensitive data from legitimate requests going over the network.
  4. Data Exfiltration - A bad actor sends sensitive data out of the protected network to a destination of their choosing.

Applying Zero Trust Networking in Highly Regulated Industries

To combat increased high profile cyber attacks, regulations and standards are evolving to include stricter controls to enforce that organizations follow best practices when processing and storing sensitive data. 

The most common technical requirements across regulations and standards are:

  • Authentication - verify the identity of the actor seeking access to protected data.
  • Authorization - verify the actor is allowed to access the requested protected data.
  • Accounting - mechanisms for recording and examining activities within the system.
  • Data Integrity - protecting data from being altered or destroyed in an unauthorized manner.

As you may have noticed, applying Zero Trust Networking within your application infrastructure not only increases your security posture and helps mitigate cyber attacks, it also addresses control requirements set forth in regulations and standards such as HIPAA, PCI-DSS, GDPR, and FISMA.

Use Istio to Achieve Zero Trust the Easy Way

High-profile data breaches are at an all-time high, cost an average of $3.92 million, and take upwards of 314 days from breach to containment. Implementing Zero Trust Networking with Istio to secure your microservice architecture at scale is simple, requires little effort, and can be completed with minimal service disruption. If you would like to discuss the most effective way for your organization to achieve zero trust, grab some time to talk through your use case and how Aspen Mesh can help solve your security concerns.



Simplifying Microservices Security with Incremental mTLS

Kubernetes removes much of the complexity and difficulty involved in managing and operating a microservices application architecture. Out of the box, Kubernetes gives you advanced application lifecycle management techniques like rolling upgrades, resiliency via pod replication, auto-scalers and disruption budgets, efficient resource utilization with advanced scheduling strategies and health checks like readiness and liveness probes. Kubernetes also sets up basic networking capabilities which allow you to easily discover new services getting added to your cluster (via DNS) and enables pod to pod communication with basic load balancing.

However, most of the networking capabilities provided by Kubernetes and its CNI providers are constrained to layer 3/4 (networking/protocols like TCP/IP) of the OSI stack. This means that any advanced networking functionality (like retries or routing) which relies on higher layers, i.e. parsing application protocols like HTTP/gRPC (layer 7) or encrypting traffic between pods using TLS (layer 5), has to be baked into the application. Relying on your applications to enforce network security is fraught with pitfalls: it tightly couples your operations/security and development teams, and it burdens your application developers with owning complicated infrastructure code.

Let’s explore what it takes for applications to perform TLS encryption for all inbound and outbound traffic in a Kubernetes environment. In order to achieve TLS encryption, you need to establish trust between the parties involved in communication. For establishing trust, you need to create and maintain some sort of PKI which can generate certificates, revoke them and periodically refresh them. As an operator, you now need a mechanism to provide these certificates (maybe use Kubernetes secrets?) to the running pods and update the pods when new certificates are minted. On the application side, you have to rely on OpenSSL (or its derivatives) to verify trust and encrypt traffic. The application developer team needs to handle upgrading these libraries when CVE fixes and upgrades are released. In addition to all these complexities, compliance concerns may also require that you only support a minimum TLS version (or higher) and a subset of ciphers, which requires creating and supporting more configuration options in your applications. All of these challenges make it very hard for organizations to encrypt all pod network traffic on Kubernetes, whether it’s for compliance reasons or achieving a zero trust network model.
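To give a sense of the per-application configuration this implies, here is a small sketch using Python’s standard `ssl` module. It pins a minimum TLS version and narrows the cipher list for outbound connections. The choice of TLS 1.2 and the `ECDHE+AESGCM` cipher string are illustrative assumptions, not a compliance recommendation, and the function name is hypothetical.

```python
import ssl


def compliant_client_context():
    """Create a client TLS context with a version floor and a narrowed cipher list."""
    context = ssl.create_default_context()  # verifies server certificates by default
    # Refuse to negotiate anything older than TLS 1.2 (assumed requirement).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict TLS 1.2 suites to ECDHE key exchange with AES-GCM (assumed requirement).
    context.set_ciphers("ECDHE+AESGCM")
    return context


context = compliant_client_context()
print(context.minimum_version)
print(len(context.get_ciphers()), "cipher suites enabled")
```

Every service, in every language your teams use, would need an equivalent of this code kept in sync with policy, which is exactly the duplication the following approach removes.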

This is the problem that a service mesh leveraging the sidecar proxy approach is designed to solve. The sidecar proxy can initiate a TLS handshake and encrypt traffic without requiring any changes or support from the applications. In this architecture, the application pod makes a request in plain text to another application running in the Kubernetes cluster which the sidecar proxy takes over and transparently upgrades to use mutual TLS. Additionally, the Istio control plane component Citadel handles creating workload identities using the SPIFFE specification to create and renew certificates and mount the appropriate certificates to the sidecars. This removes the burden of encrypting traffic from developers and operators.

Istio provides a rich set of tools to configure mutual TLS globally (on or off) for the entire cluster, or to adopt it incrementally by enabling mTLS for individual namespaces or for a subset of services and their clients. This is where things get a little complicated. In order to correctly configure mTLS for one service, you need to configure an Authentication Policy for that service and the corresponding DestinationRules for its clients.
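As a rough sketch of what that pair of resources looks like, the manifests below require mTLS for a single service and tell its clients to originate it. The service name billing, the default namespace and the resource names are hypothetical; the resource kinds and fields are the Istio v1alpha1 authentication and v1alpha3 networking APIs of this release line.

```yaml
# Server side: require mutual TLS for traffic arriving at the
# (hypothetical) billing service.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "billing-mtls"
  namespace: "default"
spec:
  targets:
  - name: billing
  peers:
  - mtls: {}
---
# Client side: sidecars calling billing must originate mutual TLS
# using the certificates issued by the mesh.
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "billing-mtls"
  namespace: "default"
spec:
  host: "billing.default.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```

If only one of the two is applied, clients and server disagree about whether the connection is encrypted, which is the kind of mismatch that shows up as failed requests.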

Both the Authentication policy and Destination rule follow a complex set of precedence rules which must be accounted for when creating these configuration objects. For example, a namespace level Authentication policy overrides the mesh level global policy, a service level policy overrides the namespace level policy, and a service port level policy overrides the service specific Authentication policy. Destination rules allow you to specify the client side configuration based on host names, where the highest precedence goes to the Destination rule defined in the client namespace, then the server namespace, and finally the global default Destination rule. On top of that, if you have conflicting Authentication policies or Destination rules, the system behavior can be indeterminate. A mismatch between Authentication policy and Destination rule can lead to subtle traffic failures which are difficult to debug and diagnose. Aspen Mesh makes it easy to understand mTLS status and avoid these configuration errors.
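To illustrate the Authentication policy precedence described above, here is a minimal sketch with one policy at each level. The legacy namespace, the reports service and port 8080 are hypothetical names chosen for the example; the matching client-side Destination rules are omitted and would still be needed.

```yaml
# Mesh level (lowest precedence): require mutual TLS everywhere.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "MeshPolicy"
metadata:
  name: "default"
spec:
  peers:
  - mtls: {}
---
# Namespace level: overrides the mesh policy for the hypothetical
# "legacy" namespace, accepting both plaintext and mutual TLS.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "default"
  namespace: "legacy"
spec:
  peers:
  - mtls:
      mode: PERMISSIVE
---
# Service port level (highest precedence): no peers section, so
# mutual TLS is not required on port 8080 of the reports service.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "reports-port-8080"
  namespace: "legacy"
spec:
  targets:
  - name: reports
    ports:
    - number: 8080
```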

Editing these complex configuration files in YAML can be tricky and only compounds the problem at hand. In order to simplify how you configure these resources and incrementally adopt mutual TLS in your environment, we are releasing a new feature which enables our customers to specify a service port (via APIs or UI) and their desired mTLS state (enabled or disabled). The Aspen Mesh platform automatically generates the correct set of configurations needed (Authentication policy and/or Destination rules) by inspecting the current state and configuration of your cluster. You can then view the generated YAML, edit it as needed, and store it in your CI system or apply it manually. This feature removes the hassle of learning complex Istio resources and their interaction patterns, and provides you with valid, non-conflicting and functional Istio configuration.

The customers we talk to are in various stages of migrating to a microservices architecture or a Kubernetes environment. The result is a hybrid environment in which some services are consumed by clients outside the mesh or are deployed outside Kubernetes altogether, so some services require a different mTLS policy. Our hosted dashboard makes it easy for users to identify services and workloads which have mTLS turned on or off and then create configuration using the above workflow to change the mTLS state as needed.
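For a service that is still consumed by clients outside the mesh, the hand-written equivalent of turning mTLS off might look like the sketch below. The inventory service and its namespace are hypothetical; this shows the general shape of such configuration in Istio, not the exact output the Aspen Mesh platform generates.

```yaml
# Server side: a policy with a target but no peers section means
# the (hypothetical) inventory service does not require mutual TLS.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "inventory-no-mtls"
  namespace: "default"
spec:
  targets:
  - name: inventory
---
# Client side: in-mesh sidecars send plaintext to inventory so that
# they match what the server expects.
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "inventory-no-mtls"
  namespace: "default"
spec:
  host: "inventory.default.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE
```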

If you’re an existing customer, please upgrade your cluster to our latest release (Aspen Mesh 1.1.3-am2) and log in to the dashboard to start using the new capabilities.

If you’re interested in learning about Aspen Mesh and incrementally adopting mTLS in your cluster, you can sign up for a beta account here.


Securing Containerized Applications With Service Mesh

The self-contained, ephemeral nature of microservices comes with some serious upside, but keeping track of every single one is a challenge, especially when trying to figure out how the rest are affected when a single microservice goes down. The end result is that if you’re operating or developing a microservices architecture, there’s a good chance part of your days are spent wondering what your services are up to.

With the adoption of microservices, problems also emerge due to the sheer number of services that exist in large systems. Problems like security, load balancing, monitoring and rate limiting, which had to be solved once for a monolith, now have to be handled separately for each service.

The technology aimed at addressing these microservice challenges has been rapidly evolving:

  1. Containers facilitate the shift from monolith to microservices by enabling independence between applications and infrastructure.
  2. Container orchestration tools solve microservices build and deploy issues, but leave many unsolved runtime challenges.
  3. Service mesh addresses runtime issues including service discovery, load balancing, routing and observability.

Securing Services with a Service Mesh

A service mesh provides an advanced toolbox that lets users add security, stability and resiliency to containerized applications. One of the more common applications of a service mesh is bolstering cluster security. There are three distinct capabilities provided by the mesh that enable platform owners to create a more secure architecture.

Traffic Encryption  

As a platform operator, I need to provide encryption between services in the mesh. I want to leverage mTLS to encrypt traffic between services. I want the mesh to automatically encrypt and decrypt requests and responses, so I can remove that burden from my application developers. I also want it to improve performance by prioritizing the reuse of existing connections, reducing the need for the computationally expensive creation of new ones. I also want to be able to understand and enforce how services are communicating and prove it cryptographically.

Security at the Edge

As a platform operator, I want Aspen Mesh to add a layer of security at the perimeter of my clusters so I can monitor and address compromising traffic as it enters the mesh. I can use the built-in power of Kubernetes as an ingress controller to add security with ingress rules such as allowlisting and denylisting. I can also apply service mesh route rules to manage compromising traffic at the edge. I also want control over egress so I can ensure that our network traffic does not go places it shouldn't (denylist by default and only talk to what you allowlist).

Role Based Access Control (RBAC)

As the platform operator, it’s important that I am able to enforce least privilege so the developers on my platform only have access to what they need, and nothing more. I want to enable controls so app developers can write policy for their apps and only their apps so that they can move quickly without impacting other teams. I want to use the same RBAC framework that I am familiar with to provide fine-grained RBAC within my service mesh.

How a Service Mesh Adds Security

You’re probably thinking to yourself, traffic encryption and fine-grained RBAC sound great, but how does a service mesh actually get me to them? Service meshes that leverage a sidecar approach are uniquely positioned to intercept and encrypt data. A sidecar proxy is a prime insertion point to ensure that every service in a cluster is secured and monitored in real time. Let’s explore some details around why sidecars are a great place for security.

Sidecar Is a Great Place for Security

Securing applications and infrastructure has always been daunting, in part because the adage really is true: you are only as secure as your weakest link.  Microservices are an opportunity to improve your security posture but can also cut the other way, presenting challenges around consistency.  For example, the best organizations use the principle of least privilege: an app should only have the minimum amount of permissions and privilege it needs to get its job done.  That's easier to apply where a small, single-purpose microservice has clear and narrowly-scoped API contracts.  But there's a risk that as application count increases (lots of smaller apps), this principle can be unevenly applied. Microservices, when managed properly, increase feature velocity and enable security teams to fulfill their charter without becoming the Department of No.

There's tension: Move fast, but don't let security coverage slip through the cracks.  Prefer many smaller things to one big monolith, but secure each and every one.  Let each team pick the language of their choice, but protect them with a consistent security policy.  Encourage app teams to debug, observe and maintain their own apps but encrypt all service-to-service communication.

A sidecar is a great way to balance these tensions with an architecturally sound security posture.  Sidecar-based service meshes like Istio and Linkerd 2.0 put their datapath functionality into a separate container and then situate that container as close to the application they are protecting as possible.  In Kubernetes, the sidecar container and the application container live in the same Kubernetes Pod, so the communication path between sidecar and app is protected inside the pod's network namespace; by default it isn't visible to the host or other network namespaces on the system.  The app, the sidecar and the operating system kernel are involved in communication over this path.  Compared to putting the security functionality in a library, using a sidecar adds the surface area of kernel loopback networking inside of a namespace, instead of just kernel memory management.  This is additional surface area, but not much.

The major drawbacks of library approaches are consistency and sprawl in polyglot environments.  If you have a few different languages or application frameworks and take the library approach, you have to secure each one.  This is not impossible, but it's a lot of work.  For each different language or framework, you get or choose a TLS implementation (perhaps choosing between OpenSSL and BoringSSL).  You need a configuration layer to load certificates and keys from somewhere and safely pass them down to the TLS implementation.  You need to reload these certs and rotate them.  You need to evaluate "information leakage" paths: does your config parser log errors in plaintext (so it by default might print the TLS key to the logs)?  Is it OK for app core dumps to contain these keys?  How often does your organization require re-keying on a connection?  By bytes or time or both?  Minimum cipher strength?  When a CVE in OpenSSL comes out, what apps are using that version and need updating?  Who on each app team is responsible for updating OpenSSL, and how quickly can they do it?  How many apps have a certificate chain built into them for consuming public websites even if they are internal-only?  How many Dockerfiles will you need to update the next time a public signing authority has to revoke one?  slowloris?

Your organization can do all this work.  In fact, parts probably already have - above is our list of painful app security experiences but you probably have your own additions.  It is a lot of cross-organizational effort and process to get it right.  And you have to get it right everywhere, or your weakest link will be exploited.  Now with microservices, you have even more places to get it right.  Instead, our advice is to focus on getting it right once in the sidecar, and then distributing the sidecar everywhere, and get back to adding business value instead of duplicating effort.

There are some interesting developments on the horizon like the use of kernel TLS to defer bulk and some asymmetric crypto operations to the kernel.  That's great:  Implementations should change and evolve.  The first step is providing a good abstraction so that apps can delegate to lower layers. Once that's solid, it's straightforward to move functionality from one layer to the next as needed by use case, because you don't perturb the app any more.  As precedent, consider TCP Segmentation Offload, which lets the network card manage splitting app data into the correct size for each individual packet.  This task isn't impossible for an app to do, but it turns out to be wasted effort.  Once TCP segmentation was deferred to the kernel, it left the realm of the app.  Then, kernels, network drivers, and network cards were free to focus on the interoperability and semantics required to perform TCP segmentation at the right place.  That's our position for this higher-level service-to-service communication security: move it outside of the app to the sidecar, and then let sidecars, platforms, kernels and networking hardware iterate.

Envoy Is a Great Sidecar

We use Envoy as our sidecar because it's lightweight, has some great features and good API-based configurability.  Here are some of our favorite parts about Envoy:

  • Configurable TLS Parameters: Envoy exposes all the TLS configuration points you'd expect (cipher strength, protocol versions, curves).  The advantage to using Envoy is that they're configured the same way for every app using the sidecar.
  • Mutual TLS: Typically TLS is used to authenticate the server to the client, and to encrypt communication.  What's missing is authenticating the client to the server - if you do this, then the server knows what is talking to it.  Envoy supports this bi-directional authentication out of the box, which can easily be incorporated into a SPIFFE system.  In today's complex cloud datacenter, you're better off if you trust things based on cryptographic proof of what they are, instead of network perimeter protection of where they called from.
  • BoringSSL: This fork of OpenSSL removed huge amounts of code like implementations of obsolete ciphers and cleaned up lots of vestigial implementation details that had repeatedly been the source of security vulnerabilities.  It's a good default choice if you don't need any OpenSSL-specific functionality because it's easier to get right.
  • Security Audit: A security audit can't prove the absence of vulnerabilities but it can catch mistakes that demonstrate either architectural weaknesses or implementation sloppiness.  Envoy's security audit did find issues but in our opinion indicated a high level of security health.
  • Fuzzed and Bountied: Envoy is continuously fuzzed (exposed to malformed input to see if it crashes) and covered by Google's Patch Reward security bug bounty program.
  • Good API Granularity: API-based configuration doesn't mean "just serialize/deserialize your internal state and go."  Careful APIs thoughtfully map to the "personas" of what's operating them (even if those personas are other programs).  Envoy's xDS APIs in our experience partition routing behavior from cluster membership from secrets.  This makes it easy to make well-partitioned controllers.  A knock-on benefit is that it is easy in our experience to debug and test Envoy because config constructs usually map pretty clearly to code constructs.
  • No garbage collector: There are great languages with automatic memory management like Go that we use every day.  But we find languages like C++ and Rust provide predictable and optimizable tail latency.
  • Native Extensibility via Filters: Envoy has layer 4 and layer 7 extension points via filters that are written in C++ and linked into Envoy.
  • Scripting Extensibility via Lua: You can write Lua scripts as extension points as well.  This is very convenient for rapid prototyping and debugging.
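To make the first two items in this list more concrete, here is a minimal sketch of the TLS portion of an Envoy listener filter chain, using Envoy's v3 API field names. It is a fragment rather than a complete Envoy config, the file paths and the particular cipher and curve choices are assumptions for illustration, and in a service mesh the control plane normally generates this for you.

```yaml
# Fragment of a listener filter chain; not a complete Envoy bootstrap.
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
    # Mutual TLS: the client must present a certificate too.
    require_client_certificate: true
    common_tls_context:
      # The same knobs apply to every app behind this sidecar.
      tls_params:
        tls_minimum_protocol_version: TLSv1_2
        cipher_suites:
        - ECDHE-ECDSA-AES128-GCM-SHA256
        - ECDHE-RSA-AES128-GCM-SHA256
        ecdh_curves:
        - X25519
        - P-256
      # Hypothetical paths to the sidecar's own certificate and key.
      tls_certificates:
      - certificate_chain:
          filename: /etc/certs/cert-chain.pem
        private_key:
          filename: /etc/certs/key.pem
      # Hypothetical path to the CA bundle used to verify clients.
      validation_context:
        trusted_ca:
          filename: /etc/certs/root-cert.pem
```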

One of these benefits deserves an even deeper dive in a security-oriented discussion.  The API granularity of Envoy is based on a scheme called "xDS" which we think of as follows:  Logically split the Envoy config API based on the user of that API.  The user in this case is almost always some other program (not a human), for instance a Service Mesh control plane element.

For instance, in xDS listeners ("How should I get requests from users?") are separated from clusters ("What pods or servers are available to handle requests to the shoppingcart service?").  The "x" in "xDS" is replaced with whatever functionality is implemented ("LDS" for listener discovery service).  Our favorite security-related partitioning is that the Secret Discovery Service can be used for propagating secrets to the sidecars independent of the other xDS APIs.

Because SDS is separate, the control plane can implement the Principle of Least Privilege: nothing outside of SDS needs to handle or have access to any private key material.
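As a rough sketch of that separation, the fragment below shows a TLS context that names its certificate and leaves delivery to an SDS server, instead of pointing at key files on disk. The field names are from Envoy's v3 API; the secret name workload_cert and the cluster name sds_server are hypothetical, and the cluster itself would be defined elsewhere in the Envoy config.

```yaml
# Fragment of a TLS context; the private key never appears in the
# listener or cluster configuration itself.
common_tls_context:
  tls_certificate_sds_secret_configs:
  - name: workload_cert            # hypothetical secret name
    sds_config:
      resource_api_version: V3
      api_config_source:
        api_type: GRPC
        transport_api_version: V3
        grpc_services:
        - envoy_grpc:
            cluster_name: sds_server   # hypothetical SDS cluster
```

Because the secret arrives over its own API, the components that serve listeners, routes and cluster membership never need to touch key material.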

Mutual TLS is a great enhancement to your security posture in a microservices environment.  We see mutual TLS adoption as gradual - almost any real-world app will have some containerized microservices ready to join the service mesh and mTLS on day one.  But practically speaking, many of these will depend on mesh-external services, containerized or not.  It is possible in most cases to integrate these services into the same trust domain as the service mesh, and oftentimes these components can even participate in client TLS authentication so you get true mutual TLS.

In our experience, this happens by gradually expanding the "circle" of things protected with mutual TLS.  First, stateless containerized business logic, next in-cluster third party services, finally external state stores like bare metal databases.  That's why we focus on making the state of mTLS easy to understand in Aspen Mesh, and provide assistants to help you detect configuration mishaps.

What Lives Outside the Sidecar?

You need a control plane to configure all of these sidecars.  In some simple cases it may be tempting to do this with some CI integration to generate configs plus DNS-based discovery.  This is viable but it's hard to do rapid certificate rotation.  Also, it leaves out more dynamic techniques like canaries, progressive delivery and A/B testing.  For this reason, we think most real-world applications will include an online control plane that should:

  • Disseminate configuration to each of the sidecars with a scalable approach.
  • Rotate sidecar certificates rapidly to reduce the value to an attacker of a one-time exploit of an application.
  • Collect metadata on what is communicating with what.

A good security posture means you should be automating some work on top of the control plane. We think these things are important (and built them into Aspen Mesh):

  • Organizing information to help humans narrow in on problems quickly.
  • Warning on potential misconfigurations.
  • Alerting when unhealthy communication is observed.
  • Inspecting the firehose of metadata for surprises - these patterns could be application bugs or security issues or both.

If you’re considering or going down the Kubernetes path, you should be thinking about the unique security challenges that come with microservices running in a Kubernetes cluster. Kubernetes solves many of these, but there are some critical runtime issues that a service mesh can make easier and more secure. If you would like to talk about how the Aspen Mesh platform and team can address your specific security challenge, feel free to find some time to chat with us.  Or to learn more, get the free white paper on achieving Zero-trust security for containerized applications here.


Leveraging Service Mesh To Address HIPAA Security Requirements

Building a product utilizing a distributed microservice architecture for the healthcare industry, while following the requirements set forth in the Health Insurance Portability and Accountability Act (HIPAA), is hard. Trust me, I have felt the pain. I’ve spent the majority of my career building, securing and ensuring compliance for products in highly regulated industries, including healthcare. Healthcare is a necessity for all, and the industry is growing at a rapid pace as new advancements are made. This is great for our health and wellbeing, but it starts to pose new challenges for organizations that process and store sensitive data such as Personally Identifiable Information (PII) and Electronic Protected Health Information (ePHI). What used to be a system of paper charts in manila envelopes stored in filing cabinets is now a large interconnected system where patient medications, x-rays, surgeries, diagnoses and other health related data are transferred between internal and external entities. This advancement has allowed physicians to quickly provide other entities with your entire medical history, so you receive the best care possible, as quickly as possible. But this exchange does not come without risk. Anytime you make something more accessible, you also introduce new attack surfaces and points of failure, allowing data to be leaked and increasing the possibility of malicious attacks.

The HIPAA Security Rule was created to help address this new risk. It mandates that organizations that process or store ePHI follow certain safeguards to protect sensitive data.

The technical safeguard standards introduced by the Security Rule include:

  • Authentication - verification of the identity of the actor seeking access to protected data.
  • Authorization - verification that the actor is allowed to access the requested protected data.
  • Audit Controls - mechanisms for recording and examining activities pertaining to protected data within the system.
  • Data Integrity - protecting the data from being altered or destroyed in an unauthorized manner.

Implementing these safeguards may seem like an obvious thing to do when processing or storing sensitive data, but all too often they are overlooked or deemed too difficult, expensive and/or time consuming to implement with available resources. No matter the reason, this is a violation in the eyes of the U.S. Department of Health and Human Services (HHS) Office for Civil Rights (OCR) and can result in fines up to $1.5 million a year for each violation and can even result in criminal charges. Fortunately, a service mesh helps address many of these standards in a way that requires less effort than building custom controls, and is also less error prone.

Let’s take a look at how you can leverage Aspen Mesh, the enterprise-ready service mesh built on Istio, to easily implement controls to address many of these standards that would otherwise require significant development effort and expertise.

Authentication
As briefly discussed, authentication is the verification of the identity of the actor seeking access to protected data. With Aspen Mesh, you can configure mesh-wide service-to-service authentication and end-user authentication with little effort. In fact, if you use the recommended default Aspen Mesh installation, it will enable mesh-wide mTLS automatically without requiring any code changes.

Now that you have service-to-service authentication and transport encryption enabled, the next step is to enable end-user authentication.

Below is an example of how you would enable end-user authentication on a Patient Check-in Service using an external Identity Management Service that supports JWTs (e.g. Azure Active Directory B2C, Amazon Cognito, Auth0, Okta, GSuite), so reception personnel can securely log in and check in patients as they arrive.

1. You’re going to need to make note of the JWT Issuer and JWK URI from your User Directory Service.
2. Create and apply a Policy called patients-checkin-user-auth that configures end user authentication to the Patient Check-in Service using your JWT supported Identity Management Service of choice.

apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "patients-checkin-user-auth"
spec:
  targets:
  - name: patient-checkin
  peers:
  - mtls:
  origins:
  - jwt:
      issuer: "<REPLACE_WITH_YOUR_JWT_SUPPORTED_IDENTITY_MANAGEMENT_SERVICE_ISSUER>"
      jwksUri: "<REPLACE_WITH_YOUR_JWT_SUPPORTED_IDENTITY_MANAGEMENT_SERVICE_USER_DIRECTORY_JWKS_URI>"
  principalBinding: USE_ORIGIN

3. Ensure that the Patient Check-in frontend application places the JWT in the Authorization header of HTTP requests to the backend services.
4. That’s it!

Authorization
Aspen Mesh provides flexible and fine-grained Role-Based Access Control (RBAC) via centralized policy management. With policy control, you can easily define what services are allowed to communicate, what methods services can call, rate limit requests and define and enforce quotas.

Below is a simple example of how a Patient Check-in Service can make GET, PUT, and POST requests to the Patients Service, but can’t make DELETE requests, while the Admin Service can make GET, POST, PUT, and DELETE requests to the Patients Service.

1. Create a ServiceRole called patient-service-querier which allows making GET, PUT, POST requests to the Patients Service.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: patient-service-querier
  namespace: default
spec:
  rules:
  - services: ["patients.default.svc.cluster.local"]
    methods: ["GET", "PUT", "POST"]

2. Create another ServiceRole called patients-admin that allows GET, POST, PUT, and DELETE requests to the Patients Service.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRole
metadata:
  name: patients-admin
  namespace: default
spec:
  rules:
  - services: ["patients.default.svc.cluster.local"]
    methods: ["GET", "POST", "PUT", "DELETE"]

3. Create a ServiceRoleBinding called bind-patient-service-querier which assigns the patient-service-querier role to the cluster.local/ns/default/sa/patient-check-in service account, which represents the Patient Check-In Service.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: bind-patient-service-querier
  namespace: default
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/patient-check-in"
  roleRef:
    kind: ServiceRole
    name: "patient-service-querier"

4. Lastly, we’ll create another ServiceRoleBinding called bind-patient-service-admin which assigns the patients-admin role to the cluster.local/ns/default/sa/admin service account, which represents the Admin Service.

apiVersion: "rbac.istio.io/v1alpha1"
kind: ServiceRoleBinding
metadata:
  name: bind-patient-service-admin
  namespace: default
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/admin"
  roleRef:
    kind: ServiceRole
    name: "patients-admin"

As you can see, you can quickly and effectively add Authorization between services in your mesh without any custom development work.

Audit Controls
Keeping audit records of data access is one of the key requirements for HIPAA compliance, as well as a security best practice. With Aspen Mesh, you get a single source of truth with in-depth tracing between all services within the mesh. Traces can be accessed and exported via the ‘Tracing’ tab on the Aspen Mesh Dashboard or API. You may still need to add the corresponding audit logs for specific actions that happen within a service to comply with all of the requirements, but at least you have reduced the amount of engineering effort spent on non-functional but essential tasks.

Data Integrity
Data integrity and confidentiality are arguably among the most critical requirements of HIPAA. If sensitive data such as medications, blood type or allergies are modified by or leaked to an unauthorized user, it could be detrimental to the patient. With Aspen Mesh you can quickly enable transport encryption, service-to-service authentication, authorization and monitoring so you can more easily comply with HIPAA requirements and protect patient data.

Aspen Mesh Makes it Easier to Implement and Scale a Secure HIPAA Compliant Microservice Environment
Building a HIPAA compliant microservice architecture at scale is a serious challenge without the right tools. Having to ensure each service adheres to both organizational and regulatory compliance requirements is not an easy task.  

Achieving HIPAA compliance involves addressing a number of technical security requirements such as network protection, encryption and key management, identification and authorization of users and auditing access to systems. These are all distinct development efforts that can be hard to achieve individually, but even harder to achieve as a coordinated team. The good news is, with the help of Aspen Mesh, your engineering team can spend less time building and maintaining non-functional yet essential features, and more time building features that provide direct value to your customers. 

To learn more, get the free white paper on achieving Zero-trust security for containerized applications here.


Service Mesh Security: Addressing Attack Vectors with Istio

As you break apart your monolith into microservices, you'll gain a slew of advantages such as scalability, increased uptime and better fault isolation. A downside of breaking applications apart into smaller services is that there is a greater area for attack. Additionally, all the communication that used to take place via function calls within the monolith is now exposed to the network. Adding security that addresses this must be a core consideration on your microservices journey.

One of the key benefits of Istio, the open source service mesh that Aspen Mesh is built on, is that it provides unique service mesh security and policy enforcement to microservices. An important thing to note is that while a service mesh adds several important security features, it is not the end-all-be-all for microservices security. It’s important to also consider a strategy around network security (a good read on how the network can help manage microservices), which can detect and neutralize attacks on the service mesh infrastructure itself, to ensure you’re entirely protected against today’s threats.

So let’s look at the attack vectors that Istio addresses, which include traffic control at the edge, traffic encryption within the mesh and layer-7 policy control.

Security at the Edge
Istio adds a layer of security that allows you to monitor and address compromising traffic as it enters the mesh. Istio integrates with Kubernetes as an ingress controller and takes care of load balancing for ingress. This allows you to add a level of security at the perimeter with ingress rules. You can apply monitoring around what is coming into the mesh and use route rules to manage compromising traffic at the edge.
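As one possible illustration of securing the perimeter, the sketch below defines an Istio Gateway that only accepts HTTPS for a single host at the ingress. The host name checkin.example.com and the resource name are hypothetical; the certificate paths shown are the conventional mount point for the ingress gateway certificates in Istio releases of this era and may differ in your installation.

```yaml
# Accept only TLS traffic for one (hypothetical) host at the edge.
apiVersion: "networking.istio.io/v1alpha3"
kind: "Gateway"
metadata:
  name: "checkin-gateway"
  namespace: "default"
spec:
  selector:
    istio: ingressgateway    # bind to the default ingress gateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
      privateKey: /etc/istio/ingressgateway-certs/tls.key
    hosts:
    - "checkin.example.com"
```

A VirtualService bound to this gateway would then carry the route rules that decide which in-mesh service receives the traffic.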

To ensure that only authorized users are allowed in, Istio’s Role-Based Access Control (RBAC) provides flexible, customizable control of access at the namespace-level, service-level and method-level for services in the mesh. The RBAC engine does two things: it watches for changes to RBAC policy and fetches the updated policy when it sees any, and it authorizes requests at runtime by evaluating the request context against the RBAC policies and returning the authorization result.

Encrypting Traffic
Security at the edge is a good start, but if a malicious actor gets through, Istio provides defense with mutual TLS encryption of the traffic between your services. The mesh can automatically encrypt and decrypt requests and responses, removing that burden from the application developer. It can also improve performance by prioritizing the reuse of existing, persistent connections, reducing the need for the computationally expensive creation of new ones.

Istio provides more than just client-server authentication and authorization; it allows you to understand and enforce how your services are communicating and prove it cryptographically. It automates the delivery of the certificates and keys to the services, the proxies use them to encrypt the traffic (providing mutual TLS), and Istio periodically rotates certificates to reduce exposure to compromise. You can use TLS to ensure that Istio instances can verify that they’re talking to other Istio instances to prevent man-in-the-middle attacks.

Istio makes TLS easy with Citadel, the Istio Auth controller for key management. It allows you to secure traffic over the wire and also make strong identity-based authentication and authorization for each microservice.

Policy Control and Enforcement
Istio gives you the ability to enforce policy at the application level with layer-7 level control. Applying policy at this level is ideal for service routing, retries, circuit-breaking, and for security that operates at the application layer, such as token validation. Istio provides the ability to set up allowlists and denylists so you can let in what you know is safe and keep out what you know isn’t.

Istio’s Mixer enables integrating extensions into the system and lets you declare policy constraints on network, or service behavior, in a standardized expression language. The benefit is that you can funnel all of those things through a common API which enables you to cache policy decisions at the edge of the service so, if the downstream policy systems start to fail, the network stays up.

Istio addresses some key concerns that arise with microservices. You can make sure that only the services that are supposed to talk to each other are talking to each other. You can encrypt those communications to secure against attacks that can occur when those services interact, and you can apply application-wide policy. While there are other, manual ways to accomplish much of this, the beauty of a mesh is that it brings several capabilities together and lets you apply them in a manner that is scalable.

At Aspen Mesh, we’re working on some new capabilities to help you get the most out of the security features in Istio. We’ll be posting something on that in the near future so check back in on the Aspen Mesh blog. Or to learn more, get the free white paper on achieving Zero-trust security for containerized applications here.