Helping Istio Sail

Around three years ago, we recognized the power of service mesh as a technology pattern to help organizations manage the next generation of software delivery challenges, and that insight led us to found Aspen Mesh. We also knew that efficiently managing high-scale Kubernetes applications requires the power of Istio. Having been a part of the Istio project since 0.2, we have seen countless users and customers benefit from the observability, security and traffic management capabilities that a service mesh like Istio provides. 

Our engineering team has worked closely with the Istio community over the past three years to help drive stability and add new features that make Istio increasingly easy for users to adopt. We believe in the value that open source software — and an open community — brings to the enterprise service mesh space, and we are driven to help lead that effort by setting sound technical direction. By contributing expertise, code and design ideas like Virtual Service delegation and RBAC policy enforcement for Istio resources, we have focused much of our work on making Istio more enterprise-ready. One of these contributions is Istio Vet, which was created and open sourced in the early days of Aspen Mesh to enhance Istio's user experience and validate multi-resource configurations. Istio Vet proved very valuable to users, so we worked closely with the Istio community to create istioctl analyze, adding key configuration analysis capabilities to Istio. It’s exciting to see that many of these ideas have now been implemented in the community and are part of the feature set available for broader consumption. 

As the Istio project and its users mature, there is a greater need for open communication around product security and CVEs. Recognizing this, we were fortunate to be able to help create the early disclosure process and lead the product security working group, which ensures the overall project is secure and not vulnerable to known exploits. 

We believe that an open source community thrives when users and vendors are considered equal partners. We feel privileged that we have been able to provide feedback from our customers that has helped accelerate Istio’s evolution. In addition, it has been a privilege to share the networking expertise from our F5 heritage with the Istio project as maintainers and leads of important functional areas and key working groups.

The technical leadership of Istio is a meritocracy, and we are honored that our sustained efforts have been recognized with my appointment to the Technical Oversight Committee.

As a TOC member, I am excited to work with other Istio community leaders to focus the roadmap on solving customer problems while keeping our user experience top of mind. The next and most critical challenges to solve are day-two problems and beyond, where we need to ensure smoother upgrades, enhanced security and scalability of the system. We envision use cases emerging from industries like Telco and FinServ that will push the envelope of technical capabilities beyond what many can imagine.

It has been amazing to see the user growth and the maturity of the Istio project over the last three years. We firmly believe that a more diverse leadership and an open governance in Istio will further help to advance the project and increase participation from developers across the globe.

The fun is just getting started and I am honored to be an integral part of Istio’s journey! 


How to Achieve Engineering Efficiency with a Service Mesh


As the idea for Aspen Mesh was forming in my mind, I had the opportunity to meet with a cable provider’s engineering and operations teams to discuss the challenges they faced operating their microservice architecture. When we all gathered in the large, very corporate conference room and exchanged the usual introductions, I could see that something just wasn’t right with the folks in the room. They looked like they had been hit by a truck. The reason turned this meeting into one of the most influential of my life.

It turned out that the entire team had been up all night working on an outage in some of the services that were part of their guide application. We talked about the issue, how it manifested itself and what impact it had on their customers. But there was one statement that has stuck with me since: “The worst part of this 13-hour outage was that it took us 12 hours to get the right person on the phone; and only one hour to get it fixed…”

That is when I knew that a service mesh could solve this problem and increase engineering efficiency for teams of all sizes. First, in day-to-day engineering and operations, it ensures experts stay focused on their areas of expertise. And second, when things go sideways, it is the strategic point in the stack that holds all the information needed to root-cause a problem — and also the place from which you can rapidly restore your system.

Day-to-Day Engineering and Operations

A service mesh can play a critical role in day-to-day engineering and operations activities by streamlining processes, reducing test environments and letting experts perform their duties independent of application code cycles. DevOps teams work more efficiently when developers can focus on delivering value to the company’s customers through applications, and operators can deliver value through improved customer experience, stability and security.

The properties of a service mesh can enable your organization to run more efficiently and reduce operating costs. Here are some ways a service mesh allows you to do this:

  • Canary testing of applications in production can eliminate expensive staging environments.
  • Autoscaling of applications can ensure efficient use of resources.
  • Traffic management can eliminate duplicated coding efforts to implement retry-logic, load-balancing and service discovery.
  • Encryption and certificate management can be centralized to reduce overhead and the need to make application changes and redeployment for changing security policies.
  • Metrics and tracing gives teams access to the information they need for performance and capacity planning, and can help reduce rework and over-provisioning of resources.
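As a sketch of the canary and traffic management points above, an Istio VirtualService can shift a small slice of traffic to a new version while centralizing retry logic in the platform instead of application code. The service and subset names here are hypothetical, and the v1/v2 subsets are assumed to be defined in a matching DestinationRule:

```yaml
# Hypothetical canary: send 5% of "reviews" traffic to v2, with
# retries handled by the mesh rather than in each application.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-canary
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 95          # existing version keeps most traffic
        - destination:
            host: reviews
            subset: v2
          weight: 5           # canary receives a small slice
      retries:
        attempts: 3           # retry logic lives in the mesh, not the app
        perTryTimeout: 2s
```

If the canary behaves well, the weights can be shifted gradually toward v2 without redeploying either service.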

As organizations continue to shift-left and embrace DevOps principles, it is important to have the right tools to enable teams to move as quickly and efficiently as possible. A service mesh helps teams achieve this by moving infrastructure-like features out of the individual services and into the platform. This allows teams to leverage them in a consistent and compliant manner; it allows Devs to be Devs and Ops to be Ops, so together they can truly realize the velocity of DevOps.

Reducing Mean-Time-To-Resolution

Like it or not, outages happen. And when they do, you need to be able to root-cause the problem, develop a fix and deploy it as quickly as possible to avoid violating your customer-facing SLAs and your internal SLOs. A service mesh is a critical piece of infrastructure when it comes to reducing your MTTR and ensuring the best possible user experience for your customers. Because of its unique position in the platform, sitting between the container orchestration layer and the application, it can not only gather telemetry data and metrics, but also transparently implement policy and traffic management changes at run time. Here are a few examples:

  • Metrics can be collected by the proxy in a service mesh and used to understand where problems are in the application, show which services are underperforming or using too many resources, and help inform decisions on scaling and resource optimization.
  • Layer 7 traces can be collected throughout the application and correlated, allowing teams to see exactly where in the call flow a failure occurred.
  • Policy can allow platform teams to direct traffic — and in the case of outages, redirect traffic to other, healthier services.

All of this functionality can be collected and implemented consistently across services — and even clusters — without impacting the application or placing additional burden or requirements on application developers.
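As one hedged example of the traffic-redirection point above, an Istio DestinationRule can automatically eject failing endpoints from the load-balancing pool so traffic flows to healthier ones. The service name is hypothetical and the field names follow the Istio 1.5-era API:

```yaml
# Hypothetical example: eject unhealthy "checkout" endpoints after
# repeated errors, redirecting traffic to the remaining healthy pods.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: checkout-outlier-detection
spec:
  host: checkout
  trafficPolicy:
    outlierDetection:
      consecutiveErrors: 5      # eject after 5 consecutive errors
      interval: 30s             # how often hosts are scanned
      baseEjectionTime: 60s     # minimum ejection duration
      maxEjectionPercent: 50    # never eject more than half the pool
```

Because this lives in the platform, the failover behavior applies consistently without any change to the application itself.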

It has been said that downtime can cost an enterprise company up to $5,600 per minute. In an extreme example, let’s think back to my meeting with the cable provider. If a service mesh could have enabled their team to get the right expert on the phone in half the time, shaving six hours (360 minutes) off the outage, they would have saved $2,016,000. That’s a big number, and more importantly, all of those engineers could have been home with their families that night instead of in front of their monitors.


Announcing Aspen Mesh 1.5


We’re excited to announce the release of Aspen Mesh 1.5, which is based on Istio’s latest LTS release, 1.5 (specifically tag 1.5.2). Istio 1.5 is the most feature-rich and stable release since Istio 1.1 and contains many enhancements and improvements that we are excited about. With this release, Istio continues to increase stability, performance and usability. At Aspen Mesh, we’re proud of the progress the Istio community has made and are committed to helping it become even better. We continue to represent users’ interests and improve the Istio project as maintainers and working group leads for key areas, including Networking and Product Security. It has been amazing to see the growth of the Istio community, and we look forward to being a part of its inclusive and diverse culture.

Given the changes in the release, we understand that it can be challenging for users to focus on their key issues and discover how the new features and capabilities relate to their needs while adding business value. We hope this blog will make it easier for our users to digest the sheer volume of changes in Istio 1.5 and understand the reasons why we have, at times, chosen to disable or not enable certain capabilities. We evaluate features and capabilities within Istio from release to release in terms of stability, readiness and backwards compatibility, and we’re eager to enable all capabilities once they are ready for enterprise consumption.

Key updates in Istio 1.5 include: 

  • The networking APIs moved from v1alpha1 to v1beta1, a sign that they are getting closer to stabilizing; 
  • Envoy can now run filters written in WebAssembly (WASM); 
  • The Istio operator, along with istioctl, continues to make major strides in installing and upgrading Istio; 
  • A new API model secures mutual TLS (mTLS) and JWT use in your service mesh.

The list of features added or changed in Istio 1.5 is long and exciting, so let’s dig deeper into the changes, the rationale behind them and how they will impact you as an Aspen Mesh customer.


New Security APIs

Istio 1.5 introduces new security APIs (PeerAuthentication, RequestAuthentication) that work in conjunction with Authorization Policy resources to create a stronger security posture for your applications. We understand that customers have widely adopted Authentication Policies, and we know the pain of incrementally adopting mTLS as well as debugging the issues encountered along the way. We believe that the new set of APIs is a step forward in the evolution of application security. With the older APIs deprecated in Istio 1.5, and slated for removal in Istio 1.6, it is imperative that customers understand the work required to ease this transition.

To ease migration from the v1alpha1 APIs to the v1beta1 APIs, Aspen Mesh has updated our dashboard to reflect mTLS status using the new APIs, updated our Secure Ingress API to use the new security APIs, and updated our configuration generation tool to help you incrementally adopt the new APIs.


Challenges with Older APIs

While Authentication Policies worked for most use cases, there were architectural issues with the API choices.

Authentication Policies were specified at the service level but applied at the workload level. This often caused confusion and potential policy escapes, as all other Istio resources are specified and applied at the workload level. The new API addresses this concern and is consistent with the Istio configuration model.

Authentication Policies handled both end-user and peer authentication, causing confusion. The new API separates them, as they are different concerns that solve different sets of problems. This logical separation should make it easier to secure your application because the possibility of misconfiguration is reduced. For example, when configuring mTLS, if multiple PeerAuthentication resources conflict, the strongest one (by security enforcement) is applied. Previously, conflict resolution was undefined and behavior was nondeterministic.
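As a sketch of the new workload-level model, a PeerAuthentication resource enforcing strict mTLS for a single workload might look like the following. The app label and namespace here are hypothetical:

```yaml
# Hypothetical workload-level policy: require strict mTLS for pods
# labeled app=billing in the default namespace. A mesh-wide default
# would instead live in the root namespace with no selector.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: billing-strict
  namespace: default
spec:
  selector:
    matchLabels:
      app: billing
  mtls:
    mode: STRICT
```

Because the selector matches workload labels, the policy applies at the same level as other Istio resources, closing the service-vs-workload gap described above.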

Authentication Policies used to validate end-user JWT credentials and reject any request that triggered the rule and carried a missing, invalid or expired JWT token. The new API separates the authentication and authorization pieces into two composable layers. This has the benefit of expressing all allow/deny policies in a single place, i.e. Authorization resources. The downside of this approach is that if a RequestAuthentication policy is applied but no JWT token is presented, the request is allowed unless there is a deny Authorization policy. However, separating authentication/identity from authorization follows traditional models of securing systems with respect to access control. If you are currently using Authentication policies to reject requests without any JWT tokens, you will need to add a deny Authorization policy to achieve parity. While Istio has greatly simplified its security model, our Secure Ingress API provides a higher-level abstraction for the typical use case. Ease of adoption is one of our core tenets, and we believe that Secure Ingress delivers just that. Feel free to reach out to us with questions about how to do this.
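A minimal sketch of that parity pattern, using hypothetical workload labels and issuer details, might look like this: a RequestAuthentication that validates JWTs, paired with a deny AuthorizationPolicy that rejects requests carrying no request principal (i.e., no valid token):

```yaml
# Hypothetical parity example: validate JWTs for app=frontend, and
# deny requests with no request principal, since RequestAuthentication
# alone allows token-less requests through.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: frontend-jwt
  namespace: default
spec:
  selector:
    matchLabels:
      app: frontend
  jwtRules:
    - issuer: "https://issuer.example.com"
      jwksUri: "https://issuer.example.com/.well-known/jwks.json"
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: frontend-require-jwt
  namespace: default
spec:
  selector:
    matchLabels:
      app: frontend
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ["*"]
```

Together, the two resources reproduce the old reject-on-missing-token behavior while keeping authentication and authorization in separate, composable layers.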


Migration Towards New APIs

You don’t have to worry about resources defined by the existing Authentication Policy API, as they will continue to work in Istio 1.5. Aspen Mesh supports the direction of the Istio community as it trends toward using workload selectors throughout its APIs. We are hopeful that adopting these new APIs will alleviate some application security concerns. 

With the removal of the v1alpha1 APIs in Istio 1.6 (ETA June 2020), we are focused on helping our customers migrate to the new APIs. Our Secure Ingress Policy has already been ported to support the new APIs, as has our tool to help you incrementally adopt mTLS. Note that our Secure Ingress controller preserves backward compatibility by creating both RequestAuthentication and Authorization policies as needed to ensure paths protected by JWT tokens cannot be accessed without tokens.

Users upgrading to Aspen Mesh 1.5.2 will see their MeshPolicy replaced with a mesh-wide PeerAuthentication resource. This is intentional (upstream Istio chose not to remove a previously installed MeshPolicy) so that users can quickly migrate their clusters to the new security APIs. With that in mind, istioctl now includes analyzers to help you find and manage your MeshPolicy/Policy resources, giving you a deprecation message with instructions to migrate away from your existing policies.


Extending Envoy Through WASM

WASM support, as an idea, can be traced back to October 2018. A proxy ABI specification for WASM was also drafted with the idea that any proxy, not just Envoy, can adopt WASM, helping to further cement the Istio community as thought leaders in the service mesh industry. The Istio community also made major contributions to Envoy through the envoy-wasm repository to enable WASM support; this repository is the fork of Envoy used in Istio 1.5. There are plans to incorporate this hard work into the mainline Envoy repository.

This is a major feature as it will enable users to extend Envoy in new and exciting ways.

We applaud the Istio community for its work to deliver this capability. Previously, writing a filter for Envoy required extensive knowledge not only of Envoy’s C++ code, but also of how to build, integrate and deploy Envoy.

WebAssembly support in Envoy will allow users to create, share and extend Envoy to meet their needs. Support for creating Envoy extensions already exists, with SDKs for C++, Rust and AssemblyScript, and there are plans to increase the number of supported languages shortly. Istio is also continuing to build supporting infrastructure around these Envoy features to easily integrate them into the Istio platform. There is also ongoing work to move existing Istio filters to WASM; in fact, Telemetry V2 is a WASM extension. Istio is eating its own dog food, and the results speak for themselves. Expect future blogs about WASM capabilities in Istio as we learn more about its feature set.

For more information, please see Istio’s WASM announcement.


istiod and Helm Based Installs

Support for Helm v3-based installs continues to be a top priority for Aspen Mesh, and the Aspen Mesh Helm installation process remains a seamless operation for your organization. Aspen Mesh 1.5 does not yet support the istiod monolith; it continues to ship with Istio’s microservice control plane.

We are actively working with the Istio community to make installations and upgrades from the microservice control plane to the monolithic platform a first-class experience with minimal downtime in Istio 1.6. This will ensure that our customers have a stable and qualified release when they migrate to istiod.

As Istio has evolved, a decision was made to simplify Istio and transform its microservice platform to that of a monolith for operational simplification and overhead reduction. The upgrade path from microservices control plane to monolith (istiod) requires a fresh install and possible downtime. We want our customers to have a smooth upgrade and experience minimal downtime disruption.


Moving Towards a Simplified Operational System (istiod)

For end users, the benefits of a monolithic approach include reduced installation complexity and fewer deployed services, which make it easier to understand what is running in their Kubernetes environment, especially as communication within the control plane and between the control plane and data plane continues to grow in complexity. 

With the migration toward istiod, one of the key features we anticipate will mature in Istio 1.6 is the Istio in-cluster operator. In-cluster operators have quickly gained traction within the Kubernetes community, and the Istio community has made a significant effort toward managing the lifecycle of Istio. A key feature we are excited about is the ability to run multiple control planes, which would enable such things as canaries and potentially in-place upgrades. As the in-cluster operator continues to evolve, we will keep evaluating it, and when we feel it is ready for enterprise use you can expect a more detailed blog to follow.


Other Features on the Horizon

As always, Istio and Aspen Mesh are constantly moving forward. Here is a preview of some new features we see on the horizon.


Mixer-less Telemetry

Many users have inquired about mixer-less telemetry, henceforth referred to as Telemetry V2. During testing, Aspen Mesh found an issue with TCP blackhole telemetry, others found an issue with HTTP blackhole telemetry, and a few other minor issues were discovered. Aspen Mesh is committed to helping the Istio community test and fix issues found in Telemetry V2, but until it reaches feature parity with Mixer telemetry we will continue to ship with Mixer, aka Telemetry V1. This is perhaps our most requested feature; the work surrounding it has been a Herculean effort by the Istio community, and we are eager to enable it in a future Aspen Mesh release.


Intelligent Inbound and Outbound Protocol Detection

While upstream Istio has previously enabled outbound intelligent protocol detection, Aspen Mesh decided to disable it; the reasoning is detailed in our Protocol sniffing in production blog. In Istio 1.5, inbound intelligent protocol detection was also enabled. While this has the potential to be a powerful feature, Aspen Mesh believes that deterministic behavior is best defined by our end users. There have been a handful of security exploits related to protocol sniffing, so we will continue to monitor its stability and performance implications and enable it in future releases if warranted.

For this and prior releases, we recommend our users continue to prefix the port name with the appropriate supported protocol as suggested here. You can also use our vetter to discover any missing port prefixes in your service mesh.
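For example, a Kubernetes Service with explicit protocol prefixes in its port names removes any need for sniffing. The service name and ports below are illustrative:

```yaml
# Explicit protocol prefixes in port names ("http-", "grpc-", "tcp-", ...)
# tell Istio how to treat each port deterministically.
apiVersion: v1
kind: Service
metadata:
  name: reviews
spec:
  selector:
    app: reviews
  ports:
    - name: http-web      # explicitly HTTP
      port: 9080
    - name: grpc-ratings  # explicitly gRPC
      port: 9081
```

Ports without a recognized prefix would otherwise fall back to plain TCP handling or protocol detection, depending on mesh settings.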


Next Steps for You

The Aspen Mesh 1.5 binaries are available for download here. If you have any questions about your upgrade path, feel free to reach out to us at support@aspenmesh.io.

If you’re new to Aspen Mesh, you can download a free 30-day trial here.


Service Mesh University


We're Excited to Launch Service Mesh University

There’s a lot of talk, and even more questions, about service mesh these days.


What is it? Do you really need it? Who should own it? How do you get it running? Will it play nicely with your tech stack? How do you know it's working? How is it evolving?

Service mesh can be complex; that’s why we’ve created Service Mesh University (SMU). This series of seven short classes lets you find the answers to these questions, on demand, at your own pace.

Each class is hosted by a different Aspen Mesh 'meshpert' and is tailored to its topic. In addition, a transcript of each video, an outline and summary of each class, and extra links and materials with additional related information are posted along with each video for you to take with you.

Why SMU?

Since 2017, Aspen Mesh has been at the forefront of service mesh technology. Aspen Meshers come from a myriad of startups and some of the most recognizable companies in the world, but the one thing we all have in common is a history of solving complex engineering and infrastructure challenges. Our engineers are experts in Istio, Envoy and Kubernetes, and we can help you get the most out of containerized applications. And we don't want to keep all that expertise to ourselves!

As a team intent on driving improved operations through smarter, more efficient infrastructure that is also reliable and easy to operate, we hope this new video collection helps you learn how a service mesh can help you.

What You'll Learn

Class 101: Intro to Service Mesh (with Zach Jory and Rose Sawvel)

  • Learn what a service mesh is
  • Discover the basics of how a service mesh works
  • Explore functionality a service mesh provides

Class 201: Foundations of Service Mesh (with Shawn Wormke)

  • Learn about the service mesh landscape
  • Discover how a service mesh can help your team and your end users
  • Explore prerequisites you’ll need before getting started with a service mesh

Class 301: Service Mesh Architectures (with Andrew Jenkins)

  • Find out more about service mesh as the next tier of communication patterns for microservices
  • Learn about the Istio architecture and how it improves Kubernetes
  • Discover the different service mesh components and how they operate
  • See how service mesh enables more efficient engineering processes

Class 401: Security, Reliability, and Observability (with Granville Schmidt)

  • Learn about security, reliability and observability
  • Discover the power that a service mesh provides by combining these three elements

Class 501: Setting Up Your Service Mesh (with Jacob Delgado)

  • Learn about what you need before you start setting up the Aspen Mesh service mesh
  • Walk through how to set it up
  • Discover what it helps you do

Class 601: Maintaining and Improving Your Service Mesh (with Michael Davis)

  • Learn about common issues to watch out for when operating a service mesh
  • Discover best practices for keeping your service mesh up to date
  • Find out about integrations with other tools that could have an impact on how you use your service mesh

Class 701: The Future of Service Mesh (with Neeraj Poddar)

  • Learn about where service mesh is headed
  • Discover what that means for you if you’re already using a service mesh
  • Find out how technologies like service mesh are helping companies deliver greater value to their end users

We hope you'll join us to learn more about service mesh!






Agility with Stability: Why You Need a Service Mesh

“Move fast and break things” may have worked for Facebook, but it's a non-starter for most companies.  Move fast? Yes. Break things? Not so much.  

You're probably working at a company that's trying to balance the ability to move quickly — responding to customer needs in an ever-changing world — with the need to provide secure, reliable services that are always there when customers need them. You're trying to achieve agility with stability, and that's why you may need a service mesh: a configurable infrastructure layer for a microservices application that makes communication between service instances flexible, reliable, and fast.

We work with a variety of customers, from small startups to large enterprises, but they’re all deploying and operating microservices applications at scale. No matter the company size or industry, we tend to get three common requirements:

  • I need to move quickly.
  • I need to quickly identify and solve problems.
  • I need everything to be secure.

Let's explore these three needs, as we often hear about them from platform owners and DevOps leads who are building and running next generation microservices platforms.

I Need to Move Quickly

This is where it all starts. You're moving to microservices so that you can push code to your customers faster than your competition.  This is how you differentiate and win in a market where the way you interact with your customers is through applications.

To win in these markets, you want your application developers focused on solving customer problems and delivering new application value. You don't want them thinking about how they're going to capture metrics for their app, or what library they're going to use to make it more fault tolerant, or how they're going to secure it. This is what you want your platform team solving, at scale, across your entire engineering organization.

A service mesh enables your platform team to implement organization-wide solutions like observability, advanced developer tooling, and security. This allows your application developers to focus on pushing the new features that will help you win in the market.  Focus drives speed and a service mesh creates focus.

I Need to Be Able to Quickly Identify Problems and Solve Them

In breaking your monolith down into microservices, things get more complex. That becomes painfully obvious the first time you need to respond to a production issue and it takes hours of combing through log files just to figure out where the problem occurred, let alone solving it.

A service mesh can provide the metrics and observability you need to effectively identify, troubleshoot, and solve problems in your microservices environment. At the most basic level, you'll get metrics, distributed tracing, and a service graph to work with. For more advanced implementations, you can visualize key configuration changes that correlate with changes in metrics and traces to rapidly identify the root causes of problems. Then you can start fixing them.

A service mesh also makes it easy to establish and monitor SLOs and SLIs (service level objectives and indicators) so you can prioritize critical fixes and easily share system status with the team. With a service mesh, platform owners can more quickly identify the root cause and gain a better understanding of their environments. This enables more resilient architecture in the future, preventing outages from occurring at all. 

I Need Everything to Be Secure

Security is table stakes.  They may say that all publicity is good publicity, but you do not want to be in the news for a security breach.  It's an existential threat to your business and your career. Defense in depth is the way to go, and a service mesh provides a powerful set of security tools to accomplish this.

Firstly, you can encrypt all the traffic moving among your microservices with mutual TLS that’s easy to set up and manage. That way, should an attacker compromise your perimeter defenses, they’ll find it difficult to do much in a fully encrypted environment. Client- and server-side encryption prevents common microservices vulnerabilities such as man-in-the-middle attacks.

Secondly, don't just monitor all the traffic that's entering and leaving your microservices environment; also implement fine-grained RBAC that makes the principle of least privilege a reality in your environment. Admission controls and secure ingress allow platform operators to ensure that developers follow secure and compliant practices, and also make it easy for applications to communicate securely with the internet.

And thirdly, take advantage of the observability that a service mesh provides into the security posture of your microservices, so that you have confidence everything is operating as expected.

If you’re working for a company that’s trying to walk the razor’s edge of agility with stability, check out service mesh.  It provides a powerful set of tools that help you operate a platform where application developers move quickly, while relying on the platform to solve the observability, traffic management and security challenges that come with a modern microservices environment.



Harnessing the Power of Microservices to Overcome an Uncertain Marketplace

According to PwC’s 23rd Annual Global CEO Survey, the outlook for 2020 can be summarized in one word—uncertainty. According to the survey, only 27% of CEOs are “very confident” in their prospects for revenue growth in 2020, a low level not seen since 2009. As organizations navigate their way through digital transformations, how can they leverage strategies and applications that help them overcome uncertainty, rather than causing more?

In this article, we talk with Shawn Wormke, Incubation Lead at Aspen Mesh, about how a service mesh can help companies achieve agility with stability in order to overcome uncertainty for the future and get ahead in the marketplace.  

Q: Many CEOs are feeling a sense of uncertainty. More than half of the CEOs surveyed by PwC believe the rate of global GDP growth will decline in 2020. With this concern top of mind, how can applications like service mesh drive value for organizations by addressing security challenges, skills gaps and increased complexity at scale?

A: In times of uncertainty, the potential for flat to decreasing revenues adds intense pressure for CEOs to maintain and grow their business. One of the ways they can overcome this challenge is to focus on becoming more agile and delivering more value to their customers — faster than their competitors — with new products and services. Almost every company has embraced Agile workflows and is adopting cloud and container technologies like Kubernetes to enable this, but these new complex and distributed technologies come with their own set of challenges around operations and security.

One of the patterns for addressing those challenges is service mesh, and I believe it should be at the center of a company’s approach to microservice architectures. Because a service mesh, like Istio or Aspen Mesh, is inserted next to every microservice, it provides a strategic point of control for operating and securing next-generation application architectures. By moving critical operations like encryption, certificate management, policy, traffic steering, high availability, logging and tracing out of the application and into the platform — where they belong — you can ensure that your human capital is adding application value rather than managing infrastructure. 

Q: What cloud native technologies and enterprise architecture modernization strategies are you seeing organizations leverage in order to thrive in a quickly evolving marketplace?

A: Microservices architectures, and the container technologies like Kubernetes that enable them, have fundamentally changed the way applications are delivered and managed. These new patterns allow companies to efficiently scale both software and engineering, reduce risk, and improve the overall quality of applications.

These technologies are new and, like most nascent technologies, they present challenges. But, with the amazing work of open source projects and communities like Istio, Prometheus, Jaeger, Grafana and many others, there are solutions available to help overcome these challenges.

Q: A business’s agility is what allows it to rapidly grow its revenue streams, respond to customer needs and defend against disruption. But is that enough?

A: Agility is a company’s number one business advantage. It is a business’s agility that allows it to rapidly grow revenue, respond to customer needs, and defend against disruption. It is the need for agility that drives digital transformations and causes organizations to define new ways of working, develop new application architectures, and embrace cloud and container technologies.

However, agility with stability is the critical competitive advantage. Companies that can meet evolving customer needs — while staying out of the news for downtime and security breaches — will be the winners of tomorrow.

Q: How exactly does a service mesh get you agility with stability?

A: Because a service mesh sits below the application and above the network — and has a control plane that can consume and apply policy — it enables development and platform teams to work together while focusing on their specific areas of expertise, so companies can deliver solutions to their customers that are performant, secure and compliant. 

In addition, a service mesh handles encryption and provides visibility into an application's behavior like no other technology. It can see the “final product” of the application or service as an outsider and provides a unique perspective on that service’s behavior, performance and communication patterns. This allows operators to operate, manage and scale those services based on their actual needs and the end users’ experience.

Q: Service mesh is a useful tool, but when does it make sense for an organization to consider adopting one?

A: It is true that the service mesh pattern has been around for a few years, but it is still an early market for the technology and surrounding products. At Aspen Mesh, we have been working with customers in this area for over two years, and we realize that each organization is different in its maturity and needs when it comes to Kubernetes and microservices. A company may adopt a service mesh early in order to meet compliance needs. Some organizations run into challenges in production and need visibility, while others may need to reduce errors caused by a lack of engineering expertise on their development teams.

In general, you probably need a service mesh if you can no longer draw your microservices topology on a sheet of paper, or hold it in your head. This usually happens for our customers at about 15-20 microservices.

Q: What are some of the top use cases you are seeing service mesh used for?

A: Adopting Kubernetes and containers is a journey for most organizations. Along that journey, roadblocks will be encountered that must be addressed. The most common path for the organizations we talk to involves:

  • Understanding what services they have deployed and how they are communicating,
  • Making their platform and applications comply with company or regulatory requirements, and
  • Ensuring they are providing the best possible user experience and reducing downtime.

Therefore, the most common use cases we see our customers implementing a service mesh for include visibility, observability, and encryption of service-to-service communication. More recently, we've seen adoption increase for operational benefits that allow them to quickly identify, diagnose and resolve customer-impacting problems.

Q: What do you see for the future of service mesh? How will it help organizations overcome the challenges associated with uncertainty?

A: Service mesh is a new frontier, and despite all the recent attention, it is still a nascent market and pattern. But because of its strategic point of control in application architectures and its ability to operate in a transparent and distributed manner, more and more companies, as they move from proof-of-concept to production with their new application architectures, will come to rely on a service mesh to provide a consistent layer in which they can control and manage their services while ensuring that all applications perform optimally and meet compliance and security requirements.

As service meshes mature, they will become a critical piece of infrastructure that enables organizations to maximize their true competitive advantage of agility with stability.

If you’re scaling microservices on Kubernetes, it's worth considering a service mesh to help you get the most out of your distributed systems. To learn more about service mesh, feel free to reach out to our team to schedule a time to meet.



Digital Transformation: How Service Mesh Can Help

Your Company’s Digital Transformation

It’s happening everywhere, and it’s happening fast. In order to meet consumers head on in the best, most secure ways, enterprises are jumping on the digital transformation train (check out this Forrester report). 

Several years ago, digital transformations saw companies moving from monolithic architectures towards microservices and Kubernetes, but service mesh was in its infancy. No one knew they'd need something to help manage service-to-service communication. Now, with increasing complexity and demands coupled with thinly-stretched resources or teams without service mesh expertise, supported service mesh is becoming a good solution for many--especially for DevOps teams.

Service Mesh for DevOps

"DevOps" is a term used to describe the business relationship between development and IT operations. Mostly, the term is used when referring to improving communication and collaboration between the two teams. But while Dev is responsible for creating new functionality to drive business, Ops is often the unsung--but extremely important--hero behind the scenes. In IT Ops, you’re on the hook for strategy development, system design and performance, quality control, and direction and coordination of your team, all while collaborating with the Dev team and other internal stakeholders to achieve your business’s goals and drive profitability. Ultimately, it’s the Dev and Ops teams who are responsible for ensuring that teams communicate effectively, systems are monitored correctly, high customer satisfaction is achieved, and projects and issue resolution are completed on time. A service mesh can help with this by enabling DevOps.

Integrating a Service Mesh: Align with Business Objectives

As you think about adopting a service mesh, keep in mind that your success over time is largely dependent on aligning with your company’s business objectives. Sharing business objectives like these with your service mesh team will help to ensure you get--and keep--the features and capabilities that you really need, when you need them, and that they stay relevant.

What are some of your company’s business objectives? Here are three we’ve identified that a service mesh can help to streamline:

1. Automating More Processes (i.e. Removing Toil)
Automating processes frees up your team from mundane tasks so they can focus on more important projects. Automation can save you time and money.

2. Increasing Infrastructure Performance
Building and maintaining a battle-tested environment is key to your end users’ experience, and therefore to customer retention rates and your company’s bottom line.

In addition, much of your time is spent developing strategies to monitor your systems and working through issue resolution as quickly as possible--whether issues pop up during the workday or in the middle of the night. Fortunately, because a service mesh comes with observability, security and resilience features, it can help alleviate these responsibilities, decreasing MTTD (mean time to detect) and MTTR (mean time to resolve).

3. Maintaining Delivery to Customers
Reducing friction in the user experience is the name of the game these days, so UX and reliability are key to keeping your end users happy. If you’re looking at a service mesh, you’re already using a microservices architecture, and you’re likely using Kubernetes clusters. But once those become too complex in production--or don’t have all the features you need--it’s time to add a service mesh into the mix. A service mesh’s observability features--like cluster health monitoring, service traffic monitoring, easy debugging and root cause identification with distributed tracing--help with this. In addition, an intuitive UI is key to surfacing these features in a way that is easy to understand and manipulate, so make sure you’re looking at a service mesh that’s easy for your Dev team to use. This will help provide a more seamless (and secure) experience for your end users.

Evolution; Not Revolution

How do you actually go about approaching the process of integrating a service mesh? What will drive success is achieving agility with stability. But that can be a tall order, so it can be helpful to approach integrating a service mesh as evolution rather than revolution. Three key areas to consider while you’re evaluating a service mesh include:

  1. Mitigating risk
  2. Production readiness
  3. Policy frameworks

Mitigating Risk
Risk can be terrifying, so it’s imperative to take steps to ensure that risk is mitigated as much as possible. The only time your company should be making headlines is because of good news. Ensuring security, compliance, and data integrity is the way to go. With security and compliance at top of mind for many, it’s important to address security head on. 

With a well-designed enterprise service mesh, you can expect plenty of security, compliance and policy features so it’s easy for your company to get a zero-trust network. Features can include anything from ensuring the principle of least privilege and secure default settings to technical features such as fine-grained RBAC and incremental mTLS.

Production Readiness
Your applications are ready to be used by your end users, and your technology stack needs to be ready too. What makes a real impact here is reliability. Service mesh features like dynamic request routing, fast retries, configuration vetters, circuit breaking and load balancing greatly increase the resiliency of microservice architectures. Support is also a feature that some enterprises will want to consider in light of whether service mesh expertise is a core in-house skill for the business. Having access to an expert support team can make a tremendous difference in your production readiness and your end users’ experiences.

Policy Frameworks
While configuration is useful for setting up how a system operates, policy is useful in dictating how a system responds when something happens. With a service mesh, the power of policy and configuration combined provides capabilities that can drive outcome-based behavior from your applications. A policy catalog can accelerate this behavior, while analytics examines policy violations and helps identify the best actions to take. This improves developer productivity with canary, authorization and service availability policies.

How to Measure Service Mesh Success

No plan is complete without a way to measure, iterate and improve your success over time. So how do you go about measuring the success of your service mesh? There are a lot of factors to take into consideration, so it’s a good idea to talk to your service mesh provider in order to leverage their expertise. But in the meantime, there are a few things you can consider to get an idea of how well your service mesh is working for you. Start by finding a good way to measure 1) how your security and compliance are impacted, 2) how much you’re able to reduce downtime and 3) differences you see in your efficiency.

Looking for more specific questions to ask? Check out the eBook, Getting the Most Out of Your Service Mesh for ideas on the right questions to ask and what to measure for success.


Service Mesh Landscape - Aspen Mesh

The Service Mesh Landscape

Where A Service Mesh Fits in the Landscape

Service mesh is helping to take the cloud native and open source communities to the next level, and we’re starting to see increased adoption across many types of companies -- from start-ups to the enterprise. 

While a service mesh overlaps with, complements, and in some cases replaces many tools that are commonly used to manage microservices, many other technologies are also involved in the service mesh landscape. Below, we've explained some of the ways a service mesh fits with other commonly used container tools.


Container Orchestration

Kubernetes provides scheduling, auto-scaling and automation functionality that solves most of the build and deploy challenges that come with containers. Where it leaves off, and where service mesh steps in, is solving some critical runtime challenges with containerized applications. A service mesh adds uniform metrics, distributed tracing, encryption between services and fine-grained observability of how your cluster is behaving at runtime. Read more about why container orchestration and service mesh are critical for cloud native deployments.

API Gateway

The main purpose of an API gateway is to accept traffic from outside your network and distribute it internally. The main purpose of a service mesh is to route and manage traffic within your network. A service mesh can work with an API gateway to efficiently accept external traffic then effectively route that traffic once it’s in your network. There is some nuance in the problems solved at the edge with an API Gateway compared to service-to-service communication problems a service mesh solves within a cluster. But with the evolution of cluster-deployment patterns, these nuances are becoming less important. If you want to do billing, you’ll want to keep your API Gateway. But if you’re focused on routing and authentication, you can likely replace an API gateway with service mesh. Read more on how API gateways and service meshes overlap.

Global ADC

Load balancers focus on distributing workloads throughout the network and ensuring the availability of applications and services. Load balancers have evolved into Application Delivery Controllers (ADCs) that are platforms for application delivery, ensuring that an organization’s critical applications are highly available and secure. While basic load balancing remains the foundation of application delivery, modern ADCs offer much more enhanced functionality such as SSL/TLS offload, caching, compression, rate-shaping, intrusion detection, application firewalls and remote access into a single strategic point. A service mesh provides basic load balancing, but if you need advanced capabilities such as SSL/TLS offload and rate-shaping you should consider pairing an ADC with service mesh.

mTLS

Service mesh provides defense with mutual TLS encryption of the traffic between your services. The mesh can automatically encrypt and decrypt requests and responses, removing that burden from the application developer. It can also improve performance by prioritizing the reuse of existing, persistent connections, reducing the need for the computationally expensive creation of new ones. Aspen Mesh provides more than just client-server authentication and authorization; it allows you to understand and enforce how your services are communicating and prove it cryptographically. It automates the delivery of certificates and keys to the services; the proxies use them to encrypt the traffic (providing mutual TLS), and certificates are periodically rotated to reduce exposure to compromise. You can use TLS to ensure that Aspen Mesh instances can verify that they’re talking to other Aspen Mesh instances to prevent man-in-the-middle attacks.
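As a rough illustration of what enabling mesh-wide mutual TLS looks like in Istio (using the Istio 1.3/1.4-era authentication API; resource names follow Istio's documented defaults), the configuration is a sketch along these lines:

```yaml
# Require mTLS for every service in the mesh (MeshPolicy must be named "default")
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}
---
# Configure client-side proxies to originate mTLS with Istio-managed certificates
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: istio-system
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```

Note that both halves are needed: the policy tells servers to require mTLS, while the destination rule tells clients to present the mesh-issued certificates.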

CI/CD

Modern enterprises manage their applications via an agile, iterative lifecycle model. Continuous Integration and Continuous Deployment (CI/CD) systems automate the build, test, deploy and upgrade stages. Service mesh adds power to your CI/CD systems, allowing operators to build fine-grained deployment models like canary, A/B, automated dev/stage/prod promotion, and rollback. Doing this in the service mesh layer means the same models are available to every app in the enterprise without app modification. You can also up-level your CI testing using techniques like traffic mirroring and fault injection to expose every app to complicated, hard-to-simulate fault patterns before you encounter them with real users.
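A canary deployment in the mesh layer, for example, is just a weighted traffic split. This hedged sketch (the `reviews` service and its `v1`/`v2` subsets are hypothetical, and assume a DestinationRule that defines those subsets) sends 10% of traffic to the new version:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-canary        # hypothetical name
spec:
  hosts:
  - reviews                   # hypothetical in-mesh service
  http:
  - route:
    - destination:
        host: reviews
        subset: v1            # stable version keeps 90% of traffic
      weight: 90
    - destination:
        host: reviews
        subset: v2            # canary version receives 10%
      weight: 10
```

Promoting the canary (or rolling it back) is then just a matter of adjusting the weights, with no change to the application itself.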

Credential Management 

We live in an API economy, and machine-to-machine communication needs to be secure.  Microservices have credentials to authenticate themselves and other microservices via TLS, and often also have app-layer credentials to serve as clients of external APIs. It’s tempting to focus only on the cost of initially configuring these credentials, but don’t forget the lifecycle – rotation, auditing, revocation, responding to CVEs. Centralizing these credentials in the service mesh layer reduces scope and improves the security posture.

APM

Traditional Application Performance Monitoring tools provide a dashboard that surfaces data that allows users to monitor their applications in one place. A service mesh takes this one step further by providing observability. Monitoring is aimed at reporting the overall health of systems, so it is best limited to key business and systems metrics derived from time-series based instrumentation. Observability focuses on providing highly granular insights into the behavior of systems along with rich context, perfect for debugging purposes. Aspen Mesh provides deep observability that allows you to understand the current state of your system and also provides a way to better understand system performance and behavior, even during what can be perceived as normal operation of a system. Read more about the importance of observability in distributed systems.

Serverless

Serverless computing transforms source code into running workloads that execute only when called. The key difference between service mesh and serverless is that with serverless, a service can be scaled down to 0 instances if the system detects that it is not being used, thus saving you from the cost of continually having at least one instance running. Serverless can help organizations reduce infrastructure costs, while allowing developers to focus on writing features and delivering business value. If you’ve been paying attention to service mesh, these advantages will sound familiar. The goals with service mesh and serverless are largely the same – remove the burden of managing infrastructure from developers so they can spend more time adding business value. Read more about service mesh and serverless computing.

Learn More

If you'd like to learn more about how a service mesh can help you and your company, schedule a time to talk with one of our experts, or take a look at The Complete Guide to Service Mesh.


When Do You Need A Service Mesh - Aspen Mesh

When Do You Need A Service Mesh?

One of the questions I often hear is: "Do I really need a service mesh?" The honest answer is "It depends." Like nearly everything in the technology space (or more broadly "nearly everything"), this depends on the benefits and costs. But after having helped users progress from exploration to production deployments in many different scenarios, I'm here to share my perspective on which inputs to include in your decision-making process.

A service mesh provides a consistent way to connect, secure and observe microservices. Most service meshes are tightly integrated with an orchestration platform, commonly Kubernetes. There's no way around it; a service mesh is another thing, and at least part of your team will have to learn it. That's a cost, and you should compare that cost to the benefits of operational simplification you may achieve.

But apart from costs and benefits, what should you be asking in order to determine if you really need a service mesh? The number of microservices you’re running, as well as urgency and timing, can have an impact on your needs.

How Many Microservices?

If you're deploying your first or second microservice, I think it is just fine to not have a service mesh. You should, instead, focus on learning Kubernetes and factoring stateless containers out of your applications first. You will naturally build familiarity with the problems that a service mesh can solve, and that will make you much better prepared to plan your service mesh journey when the time comes.

If you have an existing application architecture that provides the observability, security and resilience that you need, then you are already in a good place. For you, the question becomes when to add a service mesh. We usually see organizations notice the toil associated with utility code to integrate each new microservice. Once that toil gets painful enough, they evaluate how they could make that integration more efficient. We advocate using a service mesh to reduce this toil.

The exact point at which service mesh benefits clearly outweigh costs varies from organization to organization. In my experience, teams often realize they need a consistent approach once they have five or six microservices. However, many users push to a dozen or more microservices before they notice the increasing cost of utility code and the increasing complexity of slight differences across their applications. And, of course, some organizations continue scaling and never choose a service mesh at all, investing in application libraries and tooling instead. On the other hand, we also work with early birds that want to get ahead of the rising complexity wave and introduce service mesh before they've got half-a-dozen microservices. But the number of microservices you have isn’t the only part to consider. You’ll also want to consider urgency and timing. 

Urgency and Timing

Another part of the answer to “When do I need a service mesh?” is timing. The urgency of considering a service mesh depends on your organization’s challenges and goals, but it can also be gauged from your current processes and state of operations. Here are some states that may reduce or eliminate your urgency to adopt a service mesh:

  1. Your microservices are all written in one language ("monoglot") by developers in your organization, building from a common framework.
  2. Your organization dedicates engineers to building and maintaining org-specific tooling and instrumentation that's automatically built into every new microservice.
  3. You have a partially or totally monolithic architecture where application logic is built into one or two containers instead of several.
  4. You release or upgrade all-at-once after a manual integration process.
  5. You use application protocols that are not served by existing service meshes (i.e., protocols other than HTTP, HTTP/2 and gRPC).

On the other hand, here are some signals that you will need a service mesh and may want to start evaluating or adopting early:

  1. You have microservices written in many different languages that may not follow a common architectural pattern or framework (or you're in the middle of a language/framework migration).
  2. You're integrating third-party code or interoperating with teams that are a bit more distant (for example, across a partnership or M&A boundary) and you want a common foundation to build on.
  3. Your organization keeps "re-solving" problems, especially in the utility code (my favorite example: certificate rotation, while important, is no scrum team's favorite story in the backlog).
  4. You have robust security, compliance or auditability requirements that span services.
  5. Your teams spend more time localizing or understanding a problem than fixing it.

I consider this last point the three-alarm fire signaling that you need a service mesh, and it's a good way to return to the quest for simplification. When an application is failing to deliver a quality experience to its users, how does your team resolve it? We work with organizations that report that finding the problem is often the hardest and most expensive part.

What Next?

Once you've localized the problem, can you alleviate or resolve it? It's a painful situation if the only fix is to develop new code or rebuild containers under pressure. That's where you see the benefit from keeping resiliency capabilities independent of the business logic (like in a service mesh).

If this story is familiar to you, you may need a service mesh right now. If you're getting by with your existing approach, that’s great. Just keep in mind the costs and benefits of what you’re working with, and keep asking:

  1. Is what you have right now really enough, or are you spending too much time trying to find problems instead of developing and providing value for your customers?
  2. Are your operations working well with the number of microservices you have, or is it time to simplify?
  3. Do you have critical problems that a service mesh would address?

Keeping tabs on the answers to these questions will help you determine if — and when — you really need a service mesh.

In the meantime if you're interested in learning more about service mesh, check out The Complete Guide to Service Mesh.


Aspen Mesh - Service Mesh Security and Compliance

Aspen Mesh 1.4.4 & 1.3.8 Security Update

Aspen Mesh is announcing the release of 1.4.4 and 1.3.8 (based on upstream Istio 1.4.4 & 1.3.8), both addressing a critical security update. All our customers are strongly encouraged to upgrade to these versions immediately based on your currently deployed Aspen Mesh version.

The Aspen Mesh team reported this CVE to the Istio community as per Istio CVE reporting guidelines, and our team was able to further contribute to the community--and to all Istio users--by fixing this issue in upstream Istio. As this CVE is extremely easy to exploit and its risk score was deemed very high (9.0), we wanted to respond with urgency by getting a patch into upstream Istio and out to our customers as quickly as possible. This is just one of the many ways we are able to provide value to our customers as a trusted partner, by ensuring that the pieces from Istio (and Aspen Mesh) are secure and stable.

Below are details about the CVE-2020-8595, steps to verify whether you’re currently vulnerable, and how to upgrade to the patched releases.

CVE Description

A bug in Istio's Authentication Policy exact path matching logic allows unauthorized access to resources without a valid JWT token. This bug affects all versions of Istio (and Aspen Mesh) that support JWT Authentication Policy with path based trigger rules (all 1.3 & 1.4 releases). The logic for the exact path match in the Istio JWT filter includes query strings or fragments instead of stripping them off before matching. This means attackers can bypass the JWT validation by appending “?” or “#” characters after the protected paths.

Example Vulnerable Istio Configuration

Consider a JWT Authentication Policy that triggers on an exact HTTP path match of "/productpage". In this example, the Authentication Policy is applied at the ingress gateway service so that any request with a path exactly matching “/productpage” requires a valid JWT token. In the absence of a valid JWT token, the request is denied and never forwarded to the productpage service. However, due to this CVE, any request made to the ingress gateway with the path "/productpage?" and no valid JWT token is not denied but is sent along to the productpage service, thereby allowing access to a protected resource without a valid JWT token. 
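A minimal sketch of such a vulnerable policy follows (the resource name, issuer and jwksUri are illustrative placeholders, not the original post's values; the API shown is the Istio 1.3/1.4-era authentication.istio.io/v1alpha1 Policy):

```yaml
apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: ingress-jwt             # illustrative name
  namespace: istio-system
spec:
  targets:
  - name: istio-ingressgateway  # apply at the ingress gateway service
  origins:
  - jwt:
      issuer: "example-issuer"                 # illustrative issuer
      jwksUri: "https://example.com/jwks.json" # illustrative JWKS endpoint
      triggerRules:
      - includedPaths:
        - exact: /productpage   # only this exact path requires a JWT
  principalBinding: USE_ORIGIN
```

With a vulnerable proxy, a tokenless request to /productpage is denied, but a tokenless request to /productpage? or /productpage# fails the exact match and is forwarded without JWT validation.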

Validation

Since this vulnerability is in the Envoy filter added by Istio, you can also check the proxy image you’re using in the cluster locally to see if you’re currently vulnerable. Download and run this script to verify whether the proxy image you’re using is vulnerable. Thanks to Francois Pesce from the Google Istio team for helping us create this test script.

Upgrading to This Version

Please follow the upgrade instructions in our documentation to upgrade to these versions. Since this vulnerability affects the sidecar proxies and gateways (ingress and egress), it is important to follow the post-upgrade tasks here and perform a rolling upgrade of all of your sidecar proxies and gateways.