Digital Transformation: How Service Mesh Can Help

Your Company’s Digital Transformation

It’s happening everywhere, and it’s happening fast. In order to meet consumers head on in the best, most secure ways, enterprises are jumping on the digital transformation train (check out this Forrester report). 

Several years ago, digital transformations saw companies moving from monolithic architectures towards microservices and Kubernetes, but service mesh was in its infancy. No one knew they'd need something to help manage service-to-service communication. Now, with increasing complexity and demands coupled with thinly-stretched resources or teams without service mesh expertise, supported service mesh is becoming a good solution for many--especially for DevOps teams.

Service Mesh for DevOps

"DevOps" is a term used to describe the business relationship between development and IT operations. Mostly, the term is used when referring to improving communication and collaboration between the two teams. But while Dev is responsible for creating new functionality to drive business, Ops is often the unsung--but extremely important--hero behind the scenes. In IT Ops, you’re on the hook for strategy development, system design and performance, quality control, and direction and coordination of your team, all while collaborating with the Dev team and other internal stakeholders to achieve your business’s goals and drive profitability. Ultimately, it’s the Dev and Ops teams who are responsible for ensuring that teams are communicating effectively, systems are monitored correctly, high customer satisfaction is achieved, and projects and issue resolution are completed on time. A service mesh can help by giving both teams a consistent way to connect, secure and observe their services.

Integrating a Service Mesh: Align with Business Objectives

As you think about adopting a service mesh, keep in mind that your success over time is largely dependent on aligning with your company’s business objectives. Sharing business objectives like these with your service mesh team will help to ensure you get--and keep--the features and capabilities that you really need, when you need them, and that they stay relevant.

What are some of your company’s business objectives? Here are three we’ve identified that a service mesh can help to streamline:

1. Automating More Processes (i.e., Removing Toil)
Automating processes frees up your team from mundane tasks so they can focus on more important projects. Automation can save you time and money.

2. Increasing Infrastructure Performance
Building and maintaining a battle-tested environment is key to your end users’ experience, and therefore to churn, customer retention rates and your company’s bottom line.

In addition, much of your time is spent developing strategies to monitor your systems and working through issue resolution as quickly as possible--whether issues pop up during the workday, or in the middle of the night. Fortunately, because a service mesh comes with observability, security and resilience features, it can help alleviate these responsibilities, decreasing mean time to detect (MTTD) and mean time to resolve (MTTR).

3. Maintaining Delivery to Customers
Reducing friction in the user experience is the name of the game these days, so UX and reliability are key to keeping your end users happy. If you’re looking at a service mesh, you’re already using a microservices architecture, and you’re likely using Kubernetes clusters. But once those become too complex in production--or don’t have all the features you need--it’s time to add a service mesh into the mix. A service mesh’s observability features, such as cluster health monitoring, service traffic monitoring, and debugging and root cause identification with distributed tracing, help with this. In addition, an intuitive UI is key to surfacing these features in a way that is easy to understand and manipulate, so make sure you’re looking at a service mesh that’s easy for your Dev team to use. This will help provide a more seamless (and secure) experience for your end users.
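To make the MTTD and MTTR figures mentioned under the second objective concrete, here is a minimal sketch of how they could be computed from incident records. The incident timestamps are invented for illustration, and the choice to measure MTTR from the start of the incident (rather than from detection) is an assumption; teams define these metrics in different ways.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (started, detected, resolved).
INCIDENTS = [
    (datetime(2020, 3, 1, 2, 0), datetime(2020, 3, 1, 2, 30), datetime(2020, 3, 1, 4, 0)),
    (datetime(2020, 3, 9, 14, 0), datetime(2020, 3, 9, 14, 10), datetime(2020, 3, 9, 15, 0)),
]


def mean_delta(deltas):
    """Average a non-empty list of timedeltas."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: average of (detected - started)."""
    return mean_delta([detected - started for started, detected, _ in incidents])


def mttr(incidents):
    """Mean time to resolve, measured here from incident start to resolution."""
    return mean_delta([resolved - started for started, _, resolved in incidents])


print("MTTD:", mttd(INCIDENTS))
print("MTTR:", mttr(INCIDENTS))
```

Tracking these two numbers before and after adopting a mesh is one simple way to check whether the observability features are paying off.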

Evolution, Not Revolution

How do you actually go about integrating a service mesh? Success depends on having both agility and stability. That can be a tall order, so it helps to approach integrating a service mesh as evolution, rather than revolution. Three key areas to consider while you’re evaluating a service mesh include:

  1. Mitigating risk
  2. Production readiness
  3. Policy frameworks

Mitigating Risk
Risk can be terrifying, so it’s imperative to take steps to ensure that risk is mitigated as much as possible. The only time your company should be making headlines is because of good news. Ensuring security, compliance, and data integrity is the way to go. With security and compliance at top of mind for many, it’s important to address security head on. 

With a well-designed enterprise service mesh, you can expect plenty of security, compliance and policy features so it’s easy for your company to get a zero-trust network. Features can include anything from ensuring the principle of least privilege and secure default settings to technical features such as fine-grained RBAC and incremental mTLS.
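As a rough illustration of least privilege and fine-grained RBAC, the sketch below checks service-to-service calls against an explicit allow list and denies everything else. The service names, rule format and `is_allowed` helper are hypothetical and are not the policy model of any particular mesh; in a real mesh the proxies enforce rules like these, not application code.

```python
# Hypothetical allow rules, keyed by target service.
# Each rule is (calling service, HTTP method, path prefix).
ALLOW_RULES = {
    "orders": [("frontend", "GET", "/orders"), ("frontend", "POST", "/orders")],
    "payments": [("orders", "POST", "/charge")],
}


def is_allowed(source, target, method, path):
    """Return True only if an explicit rule permits the call (deny by default)."""
    for rule_source, rule_method, prefix in ALLOW_RULES.get(target, []):
        if source == rule_source and method == rule_method and path.startswith(prefix):
            return True
    return False


# The frontend may read orders, but may not call payments directly.
print(is_allowed("frontend", "orders", "GET", "/orders/42"))
print(is_allowed("frontend", "payments", "POST", "/charge"))
```

The important design choice is the final `return False`: a call that no rule mentions is rejected, which is what a secure default looks like in practice.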

Production Readiness
Your applications are ready to be used by your end users, and your technology stack needs to be ready too. What makes a real impact here is reliability. Service mesh features like dynamic request routing, fast retries, configuration vetters, circuit breaking and load balancing greatly increase the resiliency of microservice architectures. Support is also a feature that some enterprises will want to consider in light of whether service mesh expertise is a core in-house skill for the business. Having access to an expert support team can make a tremendous difference in your production readiness and your end users’ experiences.
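Circuit breaking is one of the resiliency features named above. The following is a simplified sketch of the idea, not how any specific mesh or proxy implements it: after a number of consecutive failures the breaker "opens" and fails fast, then allows a trial call once a cool-down has passed. The class name, thresholds and injectable `clock` parameter are assumptions made for this example.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive failures; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so the behavior can be tested without sleeping
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast instead of hitting the unhealthy service.
                raise RuntimeError("circuit open: failing fast")
            # Cool-down elapsed: fall through and let one trial call proceed.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

With a mesh, this logic lives in the proxy next to each service, so every application gets the same behavior without carrying a class like this in its own code.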

Policy Frameworks
While configuration is useful for setting up how a system operates, policy is useful in dictating how a system responds when something happens. With a service mesh, the power of policy and configuration combined provides capabilities that can drive outcome-based behavior from your applications. A policy catalog can accelerate this behavior, while analytics examines policy violations and understands the best actions to take. This improves developer productivity with canary, authorization and service availability policies.

How to Measure Service Mesh Success

No plan is complete without a way to measure, iterate and improve your success over time. So how do you go about measuring the success of your service mesh? There are a lot of factors to take into consideration, so it’s a good idea to talk to your service mesh provider in order to leverage their expertise. But in the meantime, there are a few things you can consider to get an idea of how well your service mesh is working for you. Start by finding a good way to measure 1) how your security and compliance are impacted, 2) how much you’re able to reduce downtime and 3) differences you see in your efficiency.

Looking for more specific questions to ask? Check out the eBook, Getting the Most Out of Your Service Mesh for ideas on the right questions to ask and what to measure for success.


Service Mesh Landscape - Aspen Mesh

The Service Mesh Landscape

Where A Service Mesh Fits in the Landscape

Service mesh is helping to take the cloud native and open source communities to the next level, and we’re starting to see increased adoption across many types of companies -- from start-ups to the enterprise. 

For any company, while a service mesh overlaps, complements, and in some cases replaces many tools that are commonly used to manage microservices, many technologies are involved in the service mesh landscape. In the following, we've explained some ways that a service mesh fits with other commonly used container tools.

Container Orchestration

Kubernetes provides scheduling, auto-scaling and automation functionality that solves most of the build and deploy challenges that come with containers. Where it leaves off, and where service mesh steps in, is solving some critical runtime challenges with containerized applications. A service mesh adds uniform metrics, distributed tracing, encryption between services and fine-grained observability of how your cluster is behaving at runtime. Read more about why container orchestration and service mesh are critical for cloud native deployments.

API Gateway

The main purpose of an API gateway is to accept traffic from outside your network and distribute it internally. The main purpose of a service mesh is to route and manage traffic within your network. A service mesh can work with an API gateway to efficiently accept external traffic then effectively route that traffic once it’s in your network. There is some nuance in the problems solved at the edge with an API Gateway compared to service-to-service communication problems a service mesh solves within a cluster. But with the evolution of cluster-deployment patterns, these nuances are becoming less important. If you want to do billing, you’ll want to keep your API Gateway. But if you’re focused on routing and authentication, you can likely replace an API gateway with service mesh. Read more on how API gateways and service meshes overlap.

Global ADC

Load balancers focus on distributing workloads throughout the network and ensuring the availability of applications and services. Load balancers have evolved into Application Delivery Controllers (ADCs), platforms for application delivery that ensure an organization’s critical applications are highly available and secure. While basic load balancing remains the foundation of application delivery, modern ADCs consolidate enhanced functionality such as SSL/TLS offload, caching, compression, rate-shaping, intrusion detection, application firewalls and remote access into a single strategic point. A service mesh provides basic load balancing, but if you need advanced capabilities such as SSL/TLS offload and rate-shaping, you should consider pairing an ADC with a service mesh.

mTLS

Service mesh provides defense with mutual TLS encryption of the traffic between your services. The mesh can automatically encrypt and decrypt requests and responses, removing that burden from the application developer. It can also improve performance by prioritizing the reuse of existing, persistent connections, reducing the need for the computationally expensive creation of new ones. Aspen Mesh provides more than just client-server authentication and authorization; it allows you to understand and enforce how your services are communicating and prove it cryptographically. It automates the delivery of certificates and keys to the services, the proxies use them to encrypt the traffic (providing mutual TLS), and the mesh periodically rotates certificates to reduce exposure to compromise. You can use TLS to ensure that Aspen Mesh instances can verify that they’re talking to other Aspen Mesh instances to prevent man-in-the-middle attacks.
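To show what "mutual" adds to ordinary TLS, here is a small sketch using Python's standard `ssl` module. It is only an illustration of the concept: in a mesh the sidecar proxies do this on behalf of your services, and the helper name and the file path parameters below are hypothetical.

```python
import ssl


def make_mtls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side TLS context that also demands a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # The "mutual" part: the handshake fails unless the peer presents a valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        # This service's own identity (hypothetical paths supplied by the caller).
        ctx.load_cert_chain(cert_file, key_file)
    if ca_file:
        # The certificate authority that signs peer certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = make_mtls_server_context()
print(ctx.verify_mode)
```

Doing this by hand means distributing and rotating those certificate files for every service, which is the burden the mesh takes over.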

CI/CD

Modern enterprises manage their applications via an agile, iterative lifecycle model. Continuous Integration and Continuous Deployment systems automate the build, test, deploy and upgrade stages. A service mesh adds power to your CI/CD systems, allowing operators to build fine-grained deployment models like canary, A/B, automated dev/stage/prod promotion, and rollback. Doing this in the service mesh layer means the same models are available to every app in the enterprise without app modification. You can also up-level your CI testing using techniques like traffic mirroring and fault injection to expose every app to complicated, hard-to-simulate fault patterns before you encounter them with real users.
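A canary deployment comes down to sending a small, controlled share of requests to the new version. The sketch below shows that weighted choice in a few lines; the version names, the 90/10 split and the `pick_version` helper are assumptions for illustration, and a real mesh makes this decision in the proxy based on routing configuration.

```python
import random

# Hypothetical split: 90% of requests to the stable version, 10% to the canary.
CANARY_SPLIT = {"v1": 90, "v2": 10}


def pick_version(weights, rng=random):
    """Pick a version name with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Route a batch of simulated requests and report the observed share per version.
rng = random.Random(42)
picks = [pick_version(CANARY_SPLIT, rng) for _ in range(10_000)]
for version in CANARY_SPLIT:
    print(version, picks.count(version) / len(picks))
```

Rollback in this model is just a weight change (for example, setting the canary back to 0), which is why it can be done without rebuilding or redeploying the application.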

Credential Management 

We live in an API economy, and machine-to-machine communication needs to be secure.  Microservices have credentials to authenticate themselves and other microservices via TLS, and often also have app-layer credentials to serve as clients of external APIs. It’s tempting to focus only on the cost of initially configuring these credentials, but don’t forget the lifecycle – rotation, auditing, revocation, responding to CVEs. Centralizing these credentials in the service mesh layer reduces scope and improves the security posture.
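Rotation is the part of the credential lifecycle that is easiest to automate. As a minimal sketch, one common approach is to renew a certificate once a fixed fraction of its lifetime has elapsed; the 80% threshold and the `needs_rotation` helper here are assumptions for illustration, not the schedule any particular mesh uses.

```python
from datetime import datetime, timedelta, timezone


def needs_rotation(not_before, not_after, now=None, threshold=0.8):
    """Return True once more than `threshold` of the certificate lifetime has elapsed."""
    now = now or datetime.now(timezone.utc)
    lifetime = not_after - not_before
    return (now - not_before) >= lifetime * threshold


# Hypothetical 90-day certificate issued on 1 January 2020.
issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(needs_rotation(issued, expires, now=issued + timedelta(days=30)))
print(needs_rotation(issued, expires, now=issued + timedelta(days=80)))
```

Running a check like this centrally, rather than in each team's service, is what "centralizing these credentials in the service mesh layer" buys you.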

APM

Traditional Application Performance Monitoring tools provide a dashboard that surfaces data, allowing users to monitor their applications in one place. A service mesh takes this one step further by providing observability. Monitoring is aimed at reporting the overall health of systems, so it is best limited to key business and systems metrics derived from time-series based instrumentation. Observability focuses on providing highly granular insights into the behavior of systems along with rich context, which makes it well suited to debugging. Aspen Mesh provides deep observability that allows you to understand the current state of your system, and also provides a way to better understand system performance and behavior, even during what can be perceived as normal operation of a system. Read more about the importance of observability in distributed systems.

Serverless

Serverless computing transforms source code into running workloads that execute only when called. The key difference between service mesh and serverless is that with serverless, a service can be scaled down to 0 instances if the system detects that it is not being used, thus saving you from the cost of continually having at least one instance running. Serverless can help organizations reduce infrastructure costs, while allowing developers to focus on writing features and delivering business value. If you’ve been paying attention to service mesh, these advantages will sound familiar. The goals with service mesh and serverless are largely the same – remove the burden of managing infrastructure from developers so they can spend more time adding business value. Read more about service mesh and serverless computing.

Learn More

If you'd like to learn more about how a service mesh can help you and your company, schedule a time to talk with one of our experts, or take a look at The Complete Guide to Service Mesh.


When Do You Need A Service Mesh - Aspen Mesh

When Do You Need A Service Mesh?

One of the questions I often hear is: "Do I really need a service mesh?" The honest answer is "It depends." Like nearly everything in the technology space (or more broadly "nearly everything"), this depends on the benefits and costs. But after having helped users progress from exploration to production deployments in many different scenarios, I'm here to share my perspective on which inputs to include in your decision-making process.

A service mesh provides a consistent way to connect, secure and observe microservices. Most service meshes are tightly integrated with an orchestration platform, commonly Kubernetes. There's no way around it; a service mesh is another thing, and at least part of your team will have to learn it. That's a cost, and you should compare that cost to the benefits of operational simplification you may achieve.

But apart from costs and benefits, what should you be asking in order to determine if you really need a service mesh? The number of microservices you’re running, as well as urgency and timing, can have an impact on your needs.

How Many Microservices?

If you're deploying your first or second microservice, I think it is just fine to not have a service mesh. You should, instead, focus on learning Kubernetes and factoring stateless containers out of your applications first. You will naturally build familiarity with the problems that a service mesh can solve, and that will make you much better prepared to plan your service mesh journey when the time comes.

If you have an existing application architecture that provides the observability, security and resilience that you need, then you are already in a good place. For you, the question becomes when to add a service mesh. We usually see organizations notice the toil associated with utility code to integrate each new microservice. Once that toil gets painful enough, they evaluate how they could make that integration more efficient. We advocate using a service mesh to reduce this toil.

The exact point at which service mesh benefits clearly outweigh costs varies from organization to organization. In my experience, teams often realize they need a consistent approach once they have five or six microservices. However, many users push to a dozen or more microservices before they notice the increasing cost of utility code and the increasing complexity of slight differences across their applications. And, of course, some organizations continue scaling and never choose a service mesh at all, investing in application libraries and tooling instead. On the other hand, we also work with early birds that want to get ahead of the rising complexity wave and introduce service mesh before they've got half-a-dozen microservices. But the number of microservices you have isn’t the only part to consider. You’ll also want to consider urgency and timing. 

Urgency and Timing

Another part of the answer to “When do I need a service mesh?” is timing. The urgency of considering a service mesh depends on your organization’s challenges and goals, but it is also shaped by your current process or state of operations. Here are some states that may reduce or eliminate your urgency to use a service mesh:

  1. Your microservices are all written in one language ("monoglot") by developers in your organization, building from a common framework.
  2. Your organization dedicates engineers to building and maintaining org-specific tooling and instrumentation that's automatically built into every new microservice.
  3. You have a partially or totally monolithic architecture where application logic is built into one or two containers instead of several.
  4. You release or upgrade all-at-once after a manual integration process.
  5. You use application protocols that are not served by existing service meshes (so usually not HTTP, HTTP/2, gRPC).

On the other hand, here are some signals that you will need a service mesh and may want to start evaluating or adopting early:

  1. You have microservices written in many different languages that may not follow a common architectural pattern or framework (or you're in the middle of a language/framework migration).
  2. You're integrating third-party code or interoperating with teams that are a bit more distant (for example, across a partnership or M&A boundary) and you want a common foundation to build on.
  3. Your organization keeps "re-solving" problems, especially in the utility code (my favorite example: certificate rotation, while important, is no scrum team's favorite story in the backlog).
  4. You have robust security, compliance or auditability requirements that span services.
  5. Your teams spend more time localizing or understanding a problem than fixing it.

I consider this last point the three-alarm fire signaling that you need a service mesh, and it's a good way to return to the quest for simplification. When an application is failing to deliver a quality experience to its users, how does your team resolve it? We work with organizations that report that finding the problem is often the hardest and most expensive part.

What Next?

Once you've localized the problem, can you alleviate or resolve it? It's a painful situation if the only fix is to develop new code or rebuild containers under pressure. That's where you see the benefit from keeping resiliency capabilities independent of the business logic (like in a service mesh).

If this story is familiar to you, you may need a service mesh right now. If you're getting by with your existing approach, that’s great. Just keep in mind the costs and benefits of what you’re working with, and keep asking:

  1. Is what you have right now really enough, or are you spending too much time trying to find problems instead of developing and providing value for your customers?
  2. Are your operations working well with the number of microservices you have, or is it time to simplify?
  3. Do you have critical problems that a service mesh would address?

Keeping tabs on the answers to these questions will help you determine if — and when — you really need a service mesh.

In the meantime, if you're interested in learning more about service mesh, check out The Complete Guide to Service Mesh.


Aspen Mesh - Service Mesh Security and Compliance

Announcing Aspen Mesh 1.4.6 Security Update

Aspen Mesh is announcing the release of 1.4.6, which addresses important Istio security updates. Below are the details of the security fixes, taken from the Istio 1.4.6 security update.

Security Update: 

ISTIO-SECURITY-2020-003: Two Uncontrolled Resource Consumption and Two Incorrect Access Control Vulnerabilities in Envoy.

  • CVE-2020-8659: The Envoy proxy may consume excessive memory when proxying HTTP/1.1 requests or responses with many small (i.e. 1 byte) chunks. Envoy allocates a separate buffer fragment for each incoming or outgoing chunk with the size rounded to the nearest 4Kb and does not release empty chunks after committing data. Processing requests or responses with a lot of small chunks may result in extremely high memory overhead if the peer is slow or unable to read proxied data. The memory overhead could be two to three orders of magnitude more than configured buffer limits.
  • CVE-2020-8660: The Envoy proxy contains a TLS inspector that can be bypassed (not recognized as a TLS client) by a client using only TLS 1.3. Because TLS extensions (SNI, ALPN) are not inspected, those connections may be matched to a wrong filter chain, possibly bypassing some security restrictions.
  • CVE-2020-8661: The Envoy proxy may consume excessive amounts of memory when responding to pipelined HTTP/1.1 requests. In the case of illegally formed requests, Envoy sends an internally generated 400 error, which is sent to the Network::Connection buffer. If the client reads these responses slowly, it is possible to build up a large number of responses, and consume functionally unlimited memory. This bypasses Envoy’s overload manager, which will itself send an internally generated response when Envoy approaches configured memory thresholds, exacerbating the problem.
  • CVE-2020-8664: For the SDS TLS validation context in the Envoy proxy, the update callback is called only when the secret is received for the first time or when its value changes. This leads to a race condition where other resources referencing the same secret (e.g., trusted CA) remain unconfigured until the secret’s value changes, creating a potentially sizable window where a complete bypass of security checks from the static (“default”) section can occur.
    • This vulnerability only affects the SDS implementation of Istio’s certificate rotation mechanism in Istio 1.4.5 and earlier, and only when both SDS and mutual TLS are enabled. SDS is off by default and must be explicitly enabled by the operator in all versions of Istio prior to Istio 1.5. Istio’s default secret distribution implementation based on Kubernetes secret mounts is not affected by this vulnerability.

Bug Fix:

  • Fixed issue preventing Kubernetes nodes from being restarted

Minor Enhancements:

  • Allow private image registries to be specified more easily
  • Better checking of problematic names in Secure Ingress
  • Better mTLS status reporting
  • Improved classification of vetter notes

If you're already using Aspen Mesh, you can get the updates here.

 

How Delphi Simplifies Kubernetes Security with Aspen Mesh

Customer Story: How Delphi Simplifies Kubernetes Security with Aspen Mesh

Delphi and Zero-Trust Security

Delphi delivers software solutions that help professional liability insurers streamline their operations and optimize their business processes. Because Delphi operates in the highly regulated healthcare industry, privacy and compliance concerns such as HIPAA and APRA mandate a highly secure environment. As such, a zero-trust environment is of utmost importance for Delphi and their customers.

The infrastructure team at Delphi has fully embraced a cloud-native stack to deliver the Delphi Digital Platform to its customers. The team leverages Kubernetes to effectively manage builds and deploys. Delphi planned to use Kubernetes from the start, but was looking for a simpler security solution for their infrastructure that could be managed without implementations in each service. 

While Delphi was getting tremendous value from Kubernetes, they needed to find an easier way to bake security into the infrastructure. Taking advantage of a service mesh was the obvious solution to address this challenge, as it provides cluster-wide mTLS encryption. 

The team chose Istio to confront this problem. The initial solution set up a certificate at the load balancer, but this left unencrypted HTTP between the load balancer and the service. That was not acceptable in a highly regulated healthcare industry with strict requirements to keep personal data secure.

Achieving Security with a Service Mesh

To solve these challenges, Delphi engaged with Aspen Mesh in order to implement an end-to-end encrypted solution, from client to back-end SaaS applications. This was achieved by enabling mTLS mesh-wide from service to service and creating custom Istio policy manifests to integrate cert-manager and Let's Encrypt for client-side encryption. As a result, Delphi is able to provide secure ingress integration for a multitenant B2C environment, allowing Delphi to deploy a fully scalable solution.

[Read the Full Case Study Here]

This Aspen Mesh solution lets Delphi use Let’s Encrypt seamlessly with Istio, removing the need to consider building security into application development and placing it into an infrastructure solution that is highly scalable. Leveraging the power of Kubernetes, Istio and Aspen Mesh, the Delphi team is delivering a highly secure platform to their customers without the need to implement encryption in each service. 

“At this point, I look at Aspen Mesh as an extension of my team” 

- Bill Reeder, Delphi Technology Lead Architect