Aspen Mesh 1.6 Service Mesh

Announcing Aspen Mesh 1.6

We’re excited to announce the release of Aspen Mesh 1.6 which is based on Istio’s release 1.6 (specific tag 1.6.5). As a release manager for Istio 1.6, I’ve been eager for Aspen Mesh’s adoption of 1.6 as Istio continues its trend of adding many enhancements and improvements. Our commitment and relationship with the Istio community continues to flourish as our co-founder and Chief Architect Neeraj Poddar was recently appointed to the Technical Oversight Committee and I joined the Product Security Working Group, a group tasked with handling sensitive security issues within Istio. With our team members joining these groups, you can be assured that your best interests are represented as we continue to develop Aspen Mesh.

As with every new major release, we’re excited to detail the new features and capabilities offered by Istio, and also new features available within Aspen Mesh. 

Hare are some key items to note for this new release:

Helm Based Installs

At Aspen Mesh, we encourage users to adopt the GitOps workflow using Helm. Istio is moving towards CLI tool based install using istioctl and the Istio Operator for installations and upgrades. While the chart structure located in the manifests/ directory shipped in Istio works with Helm, we’ve spent considerable effort in streamlining the charts to make them ready for the enterprise to ensure continuity.

However, upgrading to 1.6 will be drastic due to the structural changes made to the charts in Istio. Given the efforts we are putting into Istio, as well as the burden this places on users, our intent is to streamline this before the release of Aspen Mesh 1.7 to ease the upgrade process moving forward.

Istiod Monolith

With the change to Helm charts, users will be able to leverage Istiod and Telemetry v2. Istiod is the consolidated Istio controlplane, delivered as a monolithic deployment (with the exception of Mixer). There are two key reasons we are eager to support this consolidated deployment:

  1. It significantly reduces the memory and CPU footprint of the service mesh, resulting in lower operating costs.
  2. The simplified deployment model makes it easier for operators to debug production issues when the need arises. It’s no longer necessary to look at the logs of different services to determine the root cause of a problem, but rather just your Istiod pods.

Telemetry v2

As of Aspen Mesh 1.6, we only support Telemetry v2, also known as Mixerless telemetry. While Telemetry v2 does not have parity with Mixer, the benefits of Telemetry v2 now far outweigh the features no longer in Mixer. Don’t be alarmed as the Istio community is diligently working on having Telemetry v2 reach parity with Mixer.

Many of our users have reported Mixer related performance issues, such as high CPU load, high memory usage and even latency issues. These issues should be solved with the move towards Envoy-based filters, such as the WASM filter used by Telemetry v2. In-band and in-application, high performance C++ code should better meet the needs of large enterprises with hundreds of nodes and thousands of pods.

SDS: The Default Behavior Across Your Service Mesh

In Aspen Mesh 1.5, Secret Discovery Service (SDS) was not enabled by default for sidecar proxies across your cluster. With the Aspen Mesh 1.6 release, both gateways and workloads support SDS, allowing for better service mesh security as well as performance improvements.

For reference, beginning with Aspen Mesh 1.6, an executable istio-agent lives alongside Envoy sidecar proxy and safely shares certificates issued to it by Istiod with Envoy. This is a change from Aspen Mesh, where 1.5 Kubernetes Secrets were created by Citadel, presenting risks if the Kubernetes cluster wasn’t properly secured. One of the top benefits of SDS is that it allows Envoy to be hot-restarted when certificates are set to expire and need to be rotated.

Next Steps for You

Aspen Mesh 1.6 is available for download here. You can look at the complete list of changes in this release by visiting our release notes. If you have any questions about your upgrade path, feel free to reach out to us at

If you’re new to Aspen Mesh, you can download a free 30-day trial

Announcing Aspen Mesh 1.5

Announcing Aspen Mesh 1.5

We’re excited to announce the release of Aspen Mesh 1.5 which is based on Istio’s latest LTS release 1.5 (specific tag 1.5.2). Istio 1.5 is the most feature-rich and stable release since Istio 1.1 and contains many enhancements and improvements that we are excited about. With this release, Istio continues to increase stability, performance and usability. At Aspen Mesh, we’re proud of the work that the Istio community has made and are committed to helping it become even better. We continue to represent users’ interests and improve the istio project as maintainers and working group leads for key areas, including Networking and Product Security. It has been amazing to see the growth of the Istio community and we look forward to being a part of its inclusive and diverse culture.

Given the changes in the release, we understand that it can be challenging for users to focus on their key issues and discover how the new features and capabilities relate to their needs while adding business value. We hope this blog will make it easier for our users to digest the sheer volume of changes in Istio 1.5 and understand the reasons why we have, at times, chosen to disable or not enable certain capabilities. We evaluate features and capabilities within Istio from release to release in terms of stability, readiness and backwards compatibility, and we’re eager to enable all capabilities once they are ready for enterprise consumption.

Key updates in Istio 1.5 include: 

  • The networking APIs were moved from v1alpha1 to v1beta1 showing that they are getting closer to stabilizing; 
  • Enabling Envoy to support filters written in WebAssembly (WASM); 
  • The istio operator continues to make major strides in installing and upgrading Istio, as well as istioctl; 
  • A new API model to secure mutual TLS (mTLS) and JWT use in your service mesh.

The number of features added or changed by Istio is long and exciting, so let’s dig deeper into the changes, rationale behind them and how it will impact you as an Aspen Mesh customer.

New Security APIs

Istio 1.5 introduces new security APIs (PeerAuthentication, RequestAuthentication) that work in conjunction with Authorization Policy resources to create a stronger security posture for your applications. We understand that customers have widely adopted Authentication Policies, and the pain of incrementally adopting mTLS as well as debugging issues encountered. We believe that the new set of APIs are a step forward in the evolution of application security. With the older APIs being deprecated in Istio 1.5— and with the intent to remove these APIs in Istio 1.6—it is imperative that customers understand the work required to ease this transition.

To ease migration from the v1alpha1 APIs to the v1beta1 APIs Aspen Mesh has updated our dashboard to reflect mTLS status using the new APIs, we have updated our Secure Ingress API to use the new security APIs and we have updated our configuration generation tool to help you incrementally adopt the new APIs.

Challenges with Older APIs

While Authentication Policies worked for most of the use cases, there were architectural issues with the API choices.

Authentication Policies used to be specified at the service level, but were applied at the workload level. This often caused confusion and potential policy escapes as all other Istio resources are specified and applied at the workload level. The new API addresses this concern and is consistent with the Istio configuration model.

Authentication Policies used to handle both end user and peer authentication causing confusion. The new API separates them out as they are different concerns that solve different sets of problems. This logical separation should make it easier to secure your application as the possibility of misconfiguration is reduced. For example, when configuring mTLS if there is a conflict between multiple PeerAuthentication resources, the strongest PeerAuthentication (by security enforcement) is applied. Previously conflict resolution was not defined and behavior wasn’t deterministic.

Authentication Policies used to validate end user JWT credentials and reject any request that triggered the rule and had no or invalid/expired JWT tokens. The new API separates the Authentication and Authorization pieces into two separate composable layers. This has the benefit of expressing all allow/deny policies in a single place, i.e. Authorization resources. The downside of this approach is that if a RequestAuthentication policy is applied, but no JWT token is presented, then the request is allowed unless there's a deny Authorization policy. However, separating authentication/identity and authorization follows traditional models of securing systems, with respect to access control. If you are currently using Authentication policies to reject requests without any JWT tokens, you will need to add a deny Authorization policy for achieving parity. While Istio has greatly simplified their security model, our Secure Ingress API provides a higher level abstraction for the typical use case. Ease of adoption is one of our core tenets and we believe that Secure Ingress does just that. Feel free to reach out to us with questions about how to do this.

Migration Towards New APIs

You don’t have to worry about resources defined by the existing Authentication Policy API as they will continue to work in Istio 1.5. Aspen Mesh supports the direction of the Istio community as it trends towards using workload selectors throughout its APIs. By adopting these new APIs we are hopeful that it will alleviate some application security concerns. 

With the removal of the v1alpha1 APIs in istio 1.6 (ETA June 2020) we are focused on helping our customers migrate towards using the new APIs. Our Secure Ingress Policy has already been ported to support the new APIs, as well as our tool to help you incrementally adopt mTLS. Note that our Secure Ingress controller preserves backward compatibility by creating both RequestAuthentication and Authorization policies as needed to ensure paths protected by JWT tokens cannot be accessed without tokens.

Users upgrading to Aspen Mesh 1.5.2 will see their MeshPolicy replaced with a mesh-wide PeerAuthentication resource. This was intentional, although upstream Istio chose not to remove a previously installed MeshPolicy, so that users can quickly migrate their cluster towards using the new security APIs. With that in mind, istioctl now includes analyzers to help you find and manage your MeshPolicy/Policy resources, giving you a deprecation message with instructions to migrate away from your existing policies.

Extending Envoy Through WASM

WASM support, as an idea, can be traced back to October 2018. A proxy ABI specification for WASM was also drafted with the idea that all proxies, and not just Envoy, can adopt WASM helping to further cement the Istio community as thought leaders in the service mesh industry. The Istio community also had major contributions to Envoy through the envoy-wasm repository to enable WASM support and is the fork of Envoy used in Istio 1.5. There are plans to have their hard work incorporated into the mainline Envoy repository.

This is a major feature as it will enable users to extend Envoy in new and exciting ways.

We applaud the Istio community for its work to deliver this capability. Previously writing a filter for Envoy required extensive knowledge of not only Envoy C++ code, but also how to build it, integrate it and deploy Envoy.

WebAssembly support in Envoy will allow users to create, share and extend Envoy to meet their needs. Support for creating Envoy extensions already exists with SDKs for C++, Rust, and AssemblyScript, and there are plans to increase the number of supported languages shortly. Istio is also continuing to build its supporting infrastructure around these Envoy features to easily integrate them into the Istio platform. There is also on-going work to move already existing Istio filters to WASM; in fact, Telemetry V2 is a WASM extension. Istio is eating its own dog food and the results speak for themselves. Expect future blogs around WASM capabilities in Istio as we learn more about its feature set and capabilities.

For more information, please see Istio’s WASM announcement

istiod and Helm Based Installs

Support for Helm v3-based installs continues to be a top priority for Aspen Mesh. Aspen Mesh Helm installation process is a seamless operation for your organization. Aspen Mesh 1.5 does not support the istiod monolith yet, but continues to ship with Istio’s microservice platform.

We are actively working with the Istio community to enable installations and upgrades from a microservice platform to the monolithic platform a first-class experience with minimal downtime in istio 1.6. This will ensure that our customers will have a stable and qualified release when they migrate to istiod.

As Istio has evolved, a decision was made to simplify Istio and transform its microservice platform to that of a monolith for operational simplification and overhead reduction. The upgrade path from microservices control plane to monolith (istiod) requires a fresh install and possible downtime. We want our customers to have a smooth upgrade and experience minimal downtime disruption.

Moving Towards a Simplified Operational System (istiod)

Benefits of a monolithic approach to end users, include a reduction in installation complexity and fewer services deployed which makes it easier for users to understand what is being deployed in their Kubernetes environment, especially as the intra-communication within the controlplane and the inter-communication between the control plane and data plane continues to grow in complexity. 

With the migration towards istiod, one of the key features that we anticipate will mature in istio 1.6 is the istio in-cluster Operator. In-cluster operators have quickly gained traction within the Kubernetes community and the Istio community has made a significant effort towards managing the lifecycle of Istio. A key feature that we are eager about is the ability to have multiple control planes, which would enable such things as canaries and potentially in-place upgrades. As the in-cluster operator continues to evolve we will continue to evaluate it and when we feel it is ready for enterprise use you should expect a more detailed blog to follow.

Other Features on the Horizon

As always, Istio and Aspen Mesh are constantly moving forward. Here is a preview of some new features we see on the horizon.

Mixer-less Telemetry

Many users have inquired about mixer-less telemetry, hence referred to as Telemetry V2. During testing, Aspen Mesh found an issue with TCP blackhole telemetry, others found an issue with HTTP blackhole telemetry and a few other minor issues were found. Aspen Mesh is committed to helping the Istio community test and fix issues found in Telemetry V2, but until feature parity is reached with Mixer telemetry we will continue to ship with Mixer, aka Telemetry V1. This is perhaps our most requested feature and we are eager to enable it. The work surrounding this has been a Herculean effort by the Istio community and we are eager to enable it in a future Aspen Mesh release.

Intelligent Inbound and Outbound Protocol Detection

While upstream Istio has previously enabled outbound intelligent protocol detection, Aspen Mesh has decided to disable it; the reasoning for this was detailed in our Protocol sniffing in production blog. For Istio 1.5, inbound intelligent protocol detection was also enabled. While this has the potential to be a powerful feature, Aspen Mesh believes that deterministic behavior is best defined by our end users. There have been a handful of security exploits related to protocol sniffing and we will continue to monitor its stability and performance implications and accordingly enable them if warranted in future releases.

For this and prior releases, we recommend our users continue to prefix the port name with the appropriate supported protocol as suggested here. You can also use our vetter to discover any missing port prefixes in your service mesh.

Next Steps for You

The Aspen Mesh 1.5 binaries are available for download here. If you have any questions about your upgrade path, feel free to reach out to us at

If you’re new to Aspen Mesh, you can download a free 30-day trial here

Announcing Aspen Mesh 1.4 Service Mesh

Announcing Aspen Mesh 1.4

We’re excited to announce the release of Aspen Mesh 1.4 which is based on Istio’s latest LTS release 1.4 (specific tag 1.4.3). The Aspen Mesh 1.4.3 release builds upon our latest 1.3 release along with our new Secure Ingress API, that we will be writing about in detail in a blog soon to follow, alongside the new capabilities of Istio 1.4.

Our Open Source Commitment

Aspen Mesh has been heavily involved with the Istio community for this release. A core value of Aspen Mesh is to help drive adoption of Istio. Our Istio Vet tool was the the first tool created to help analyze configuration issues and it is great to have this capability within the core open source project by using istioctl analyze. Aspen Mesh is committed to helping our customers and the larger Istio community as we will continue to add analyzers in the future based on issues that arise.

For 1.4, we are also pleased that we could help the community create istio-client-go, allowing the open source community to programmatically use and extend the Istio APIs in their software. We have deprecated the widely adopted Aspen Mesh version of our istio-client-go in favor of the the community’s version.

Automatic mTLS

Automatic mTLS is a feature that Aspen Mesh is eager about enabling in the future, but for now is disabled in upstream Istio as it is in our supported release. We are strong advocates of using mTLS for service-to-service communication, but users adopting Istio will sometimes encounter issues and will need to debug mTLS in their service mesh.

Mixer-less Telemetry

Mixer-less telemetry moves from an experimental feature into alpha in Istio 1.4. Aspen Mesh is working with the open source community to ensure feature parity with Mixer during this state of transition, however, we continue to support Istio Mixer at this time. Please see our announcement of Aspen Mesh 1.3 to see the importance of this feature.

Istio Authorization Policies

With policy being a key reason why our customers are adopting Istio, the new Authorization Policy is a step towards simplifying Policy. Istio continues to improve their APIs to better achieve a simplified model of service mesh configuration. Stay tuned for future blogs as Aspen Mesh is a strong proponent of this model.


Along with the new analysis capabilities referenced above, istioctl also has the capability to help users with installation and upgrade along with a new experimental capability to wait for configuration to be passed to all proxies within your service mesh. At this time, we believe Helm provides the enterprise with the best way to manage deploying Istio into production, but we are always exploring and evaluating new approaches to managing your Istio deployment.

Analyzers included in Istio 1.4 help ensure that Gateways are configured properly, features that are being deprecated in Istio that are being used are given a warning message, and that all pods containing sidecars have the same, expected version of Envoy running. These are only some of the analyzers included and more information can be found here. Some of the work from Aspen Mesh’s Istio Vet was able to be ported to istioctl with the community’s help. We intend to continue porting any capabilities found in istio-vet into istioctl.

The Aspen Mesh 1.4 binaries are available for download here

How to Debug Istio Mutual TLS (mTLS) Policy Issues Using Aspen Mesh

Users Care About Secure Service to Service Communication

Mutual TLS (mTLS) communication between services is a key Istio feature driving adoption as applications do not have to be altered to support it. mTLS provides client and server side security for service to service communications, enabling organizations to enhance network security with reduced operational burden (e.g. certificate management is handled by Istio). If you are interested in learning more about this, checkout Istio's mTLS docs here. From regulatory concerns to auditing requirements and a host of other reasons businesses need to demonstrate they are following burgeoning security practices in a microservice landscape.

Many techniques evolved to help ease this requirement and enable businesses to focus on business value. Unfortunately, many of these techniques require expertise to ensure they are developed or configured properly, from IPSec to a wide range of other solutions. Unless you are a security expert, it is challenging to implement these techniques correctly. Managing ciphers, algorithms, rotating keys, certificates, and updating system libraries when CVEs are found is difficult for software developers, DevOps and sys admins to keep abreast of. Even seasoned security professionals can find it difficult to implement and audit such systems. As security is a core feature, this is where a service mesh like Aspen Mesh can help. A service mesh helps to alleviate these concerns with the goal of drastically lessening the burden of securing and auditing such systems, enabling users to focus on their core products.

Gradually Adopting mTLS Within Istio

At Aspen Mesh we recommend installing Istio with global mTLS enabled. However, very few deployments of Istio are in green-field environments where services are slowly adopted, created and can be monitored independently before new services are rolled out. In these cases, users will most likely adopt mTLS gradually service-by-service and will carefully monitor traffic behavior before proceeding to the next service.

A common problem that many users experience when enabling mTLS for service communication in their service mesh is inadvertently breaking traffic. A misconfigured AuthenticationPolicy or DestinationRule can affect communication unbeknownst to a user until other issues arise.

It is difficult to monitor for such specific failures because they occur at the transport layer (L4) where raw TCP connections are first established by the underlying OS after which the TLS handshake takes places. If a problem happens during this handshake the Envoy sidecar is not be able to create detailed diagnostic metrics and messaging as this error is not at the application layer (L7). While 503 errors can surface due to misconfiguration, a 503 alone is not specific enough to determine if the issue is due to misconfiguration or a misbehaving service. We are working with the Istio community to add telemetry for traffic failures related to mTLS misconfiguration. This requires surfacing the relevant information from Envoy which we are collaborating with in this pull request. Until such capability exists there are techniques and tools which we will discuss to aid you in debugging traffic management issues.

At Aspen Mesh we are trying to enable our users to feel confident in their ability to manage their infrastructure. Kubernetes, Istio and Aspen Mesh are the platform, but business value is derived from software written and configured in-house, so quickly resolving issues is paramount to our customer's success.

Debugging Policy Issues With Aspen Mesh

We will now walk-through debugging policy issues when using Aspen Mesh. Many of the following techniques are relevant to Istio in case you don't have Aspen Mesh installed.

In the below example, bookinfo was installed into the bookinfo namespace using Aspen Mesh with global mTLS set to PERMISSIVE. We then deployed three deployments that communicated with the productpage service spanning three different namespaces.

A namespace policy was created to set mTLS to be STRICT. However, no DestinationRules were created and as a result the system started to experience mTLS errors.

The graph suggests that there is a problem with traffic generator communicating with productpage. We will first inspect policy settings and logs of our services.

Determining the DestinationRule for a workload and an associated service is pretty straight-forward.

$ istioctl authn tls-check -n bookinfo traffic-generator-productpage-6b88d69f-xxfkn productpage.bookinfo.svc.cluster.local

where traffic-generator-productpage-6b88d69f-xxfkn is the name of a pod within the bookinfo namespace and productpage.bookinfo.svc.cluster.local is the server. The output will be similar to the following:

HOST:PORT                                       STATUS       SERVER     CLIENT     AUTHN POLICY         DESTINATION RULE
productpage.bookinfo.svc.cluster.local:9080     CONFLICT     mTLS       HTTP       default/bookinfo     destrule-productpage/bookinfo

if no conflict is found then the STATUS column will say OK, but for this example there is a conflict that exists between the AuthenticationPolicy and the DestinationRule. Inspecting the output closely, we see that there is a namespace wide AuthenticationPolicy used--determined by its name of default--and what appears, by name, to be a host-specific DestinationRule.

Using kubectl we can directly inspect the contents of the DestinationRule:

$ kubectl get destinationrule -n bookinfo destrule-productpage -o yaml
kind: DestinationRule
  annotations: |
  creationTimestamp: "2019-10-10T20:24:48Z"
  generation: 1
  name: destrule-productpage
  namespace: bookinfo
  resourceVersion: "4874298"
  selfLink: /apis/
  uid: 01612af7-eb9c-11e9-a719-06457fb661c2
  - '*'
  host: productpage
      mode: DISABLE

A conflict does exist and we can simply fix this by altering our DestinationRule to have a mode of ISTIO_MUTUAL instead of DISABLE. For this example it was a fairly simple fix. At times, however, it is possible that you may see a DestinationRule that is different from the one you expect. Reasoning about the correct DestinationRule object can be difficult without first knowing the resolution hierarchy established by Istio. In our example, the DestinationRule above also applies to the traffic-generator workloads in the other namespaces as well.

DestinationRule Hierarchy Resolution

When Istio configures the sidecars for service to service communication, it must make a determination on which DestinationRule, if any, should be used to handle communication between each service. When a client attempts to contact a server, the client's request is first routed to its sidecar and that sidecar inspects its configuration to determine the method by which it should establish a communication with the server's sidecar.

The rules by which Istio creates these sidecar configurations are as follows: clients first look for DestinationRules in their own namespace that match the FQDN of the requested server. If no DestinationRule is found then the server's namespace is checked for a DestinationRule; again, if no DestinationRule is found then the Istio root namespace (default is istio-system) is checked for a matching DestinationRule.

DestinationRules that use wildcards, specific ports and/or make use of exportTo can further make it arduous to determine DestinationRule resolution. Istio has set of guidelines to help users adopt rule changes found here.

It is also worthwhile to note that when a new DestinationRule is created to adhere to an AuthenticationPolicy change, it is important to keep any previously applied traffic rules, otherwise a behavioral change in service communication within your system may be experienced. For instance, if load balancing was previously LEAST_CONN for service to service communication due to a client namespace DestinationRule targeted for another namespace, then the new DestinationRule should inherit the load balancing setting, or the user will see a behavioral change in traffic patterns within their service mesh as load balancing for that service will use the default setting, ROUND_ROBIN.

Our product helps simplify this by respecting the rules set by Istio in 1.1.0+ and inspecting existing AuthenticationPolicies and DestinationRules when creating new ones.

Even so, it is best to use the fewest number of DestinationRules possible in a service mesh. While it is an incredibly powerful feature, it's best used with discretion and intent.

Debugging Traffic Issues

Besides globally enabling mTLS and setting the outbound traffic policy to be more restrictive, we also recommend setting global.proxy.accessLogFile to log to /dev/stdout instead of /dev/null. This will enable you to view the access logs from the Envoy sidecar within your cluster when debugging Istio configuration and policy issues.

After applying an AuthenticationPolicy or a DestinationRule it is possible that 503 HTTP Status codes will start appearing. Here are a couple of checks to aid you in diagnosing the issue to see if it is related to an mTLS issue.

First, we we will repeat what we did above, with the name of the POD being the pod seeing the 503 HTTP Status return codes:

$ istioctl authn tls-check <PODNAME> <DESTINATION SERVICE FQDN FORMAT>

In most cases this will be all of the debugging you will have to do. However, we can also dig deeper to understand the issue and it never hurts to know more about the underlying infrastructure of your system.

Remember that in a distributed system changes may take a while to propagate through a system and both Pilot and Mixer are responsible for passing configuration and enforcing policy, respectively. Let's start looking at some logs and configuration of sidecars.

By enabling proxy access logs we can view them directly:

$ kubectl logs -n <POD NAMESPACE> <PODNAME> -c istio-proxy 

where you may see logs similar to the following:

[2019-10-07T21:54:37.175Z] "GET /productpage HTTP/1.1" 503 UC "-" "-" 0 95 1 - "-" "curl/7.35.0" "819c2e8b-ddad-4579-8508-794ab7de5a55" "productpage:9080" "XXX.XXX.XXX.XXX:9080" outbound|9080||productpage.bookinfo.svc.cluster.local - XXX.XXX.XXX.XXX:9080 XXX.XXX.XXX.XXX:33834 -
[2019-10-07T21:54:38.188Z] "GET /productpage HTTP/1.1" 503 UC "-" "-" 0 95 1 - "-" "curl/7.35.0" "290b42e7-5140-4881-ae87-778b352adcad" "productpage:9080" "XXX.XXX.XXX.XXX:9080" outbound|9080||productpage.bookinfo.svc.cluster.local - XXX.XXX.XXX.XXX:9080 XXX.XXX.XXX.XXX:33840 -

It is important to note the 503 UC in the above access logs. The UC according to Envoy's documentation states that UC means Upstream connection termination in addition to 503 response code. This helps us understand that it is likely to be an mTLS issue.

If the containers inside of your service mesh contain curl (or equivalent) you can also run the following command within a pod that is experiencing 503s:

$ kubectl exec -c <CONTAINER> <PODNAME> -it -- curl -vv http://<DESTINATIONSERVICE FQDN>:PORT

which may then output something akin to

* Rebuilt URL to: http://productpage.bookinfo.svc.cluster.local:9080/
* Hostname was NOT found in DNS cache
*   Trying XXX.XXX.XXX.XXX...
* Connected to productpage.bookinfo.svc.cluster.local (XXX.XXX.XXX.XXX) port 9080 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: productpage.bookinfo.svc.cluster.local:9080
> Accept: */*
< HTTP/1.1 503 Service Unavailable
< content-length: 95
< content-type: text/plain
< date: Mon, 07 Oct 2019 22:09:23 GMT
* Server envoy is not blacklisted
< server: envoy
* Connection #0 to host productpage.bookinfo.svc.cluster.local left intact
upstream connect error or disconnect/reset before headers. reset reason: connection termination

The last line is what's important. HTTP headers were not able to be sent before the underlying TCP connection was terminated. This is a very strong indication that the TLS handshake failed.

And lastly, you can inspect the configuration sent by Pilot to your pod's sidecar using istioctl.

$ istioctl proxy-config cluster -n <POD NAMESPACE> <PODNAME> -o json

whereby if you search for the destination service name you will see an embedded metadata JSON element that names the specific DestinationRule that pod is currently using to communicate with the external service.

    "metadata": {
      "filterMetadata": {
        "istio": {
          "config": "/apis/networking/v1alpha3/namespaces/traffic-generator/destination-rule/named-destrule"

If you look closely at that returned object you can also inspect and verify rules being applied. The source of truth for a given moment is always found in your pod's Envoy sidecar configuration so while you don't need to become an expert and learn all of the nuances of debugging Istio it is another tool in your debugging toolbelt.

The Future

Istio is a incredibly sophisticated and powerful tool. Similar to other such tools, it requires expertise to get the most out of it, but the rewards are greater than the challenge. Aspen Mesh is committed to enabling Istio and our customers to succeed. As our platform matures, we will continue to help users by surfacing use cases and examples like in the above service graph, along with further in-depth ways to diagnose and troubleshoot issues. Lowering the mean time to detect (MTTD) and mean time to resolve (MTTR) for our users is critical to their success.

There are some exciting things that Aspen Mesh is planning to help our users tackle some of the hurdles we've found when adopting Istio. Keep an eye on our blog for future announcements.

Advancing the promise of service mesh: Why I work at Aspen Mesh

The themes and content expressed are mine alone, with helpful insight and thoughts from my colleagues, and are about software development in a business setting.

I’ve been working at Aspen Mesh for a little over a month and during that time numerous people have asked me why I chose to work here, given the opportunities in Boulder and the Front Range.

To answer that question, I need to talk a bit about my background. I’ve been a professional software developer for about 13 years now. During that time I’ve primarily worked on the back-end for distributed systems and have seen numerous approaches to the same problems with various pros and cons. When I take a step back, though, a lot of the major issues that I’ve seen are related to deployment and configuration around service communication:
How do I add a new service to an already existing system? How big, in scope, should these services be? Do I use a message broker? How do I handle discoverability, high availability and fault tolerance? How, and in what format, should data be exchanged between services? How do I audit the system when the system inevitably comes under scrutiny?

I’ve seen many different approaches to these problems. In fact, there are so many approaches, some orthogonal and some similar, that software developers can easily get lost. While the themes are constant, it is time consuming for developers to get up to speed with all of these technologies. There isn’t a single solution that solves every common problem seen in the backend; I’m sure the same applies to the front-end as well. It’s hard to truly understand the pros and cons of an approach until you have a working system; and when that happens and if you then realize that the cons outweigh the pros, it may be difficult and costly to get back to where you started (see sunk cost fallacy and opportunity cost). Conversely, analysis paralysis is also costly to an organization, both in terms of capital—software developers are not cheap—and an inability to quickly adapt to market pressures, be it customer needs and requirements or a competitor that is disrupting the market.

Yet the hype cycle continues. There is always a new shiny thing taking the software world by storm. You see it in discussions on languages, frameworks, databases, messaging protocols, architectures ad infinitum. Separating the wheat from the chaff is something developers must do to ensure they are able to meet their obligations. But with the signal to noise ratio being high at times and with looming deadlines not all possibilities can be explored.  

So as software developers, we have an obligation of due diligence and to be able to deliver software that provides customer value; that helps customers get their work done and doesn’t impede them, but enables them. Most customers don’t care about which languages you use or which databases you use or how you build your software or what software process methodology you adhere to, if any. They just want the software you provide to enable them to do their work. In fact, that sentiment is so strong that slogans have been made around it.

So what do customers care about, generally speaking? They care about access to their data, how they can view it and modify it and draw value from it. It should look and feel modern, but even that isn’t a strict requirement. It should be simple to use for a novice, but yet provide enough advanced capability to help your most advanced users make you learn something new about the tool you’ve created. This is information technology after all. Technology for technology’s sake is not a useful outcome.

Any work that detracts from adding customer value needs to be deprioritized, as there is always more work to do than hours in the day. As developers, it’s our job to be knee deep in the weeds so it’s easy to lose sight of that; unit testing, automation, language choice, cloud provider, software process methodology, etc… absolutely matter, but that they are a means to an end.

With that in mind, let’s create a goal: application developers should be application developers.

Not DevOps engineers, or SREs or CSRs, or any other myriad of roles they are often asked to take on. I’ve seen my peers happiest when they are solving difficult problems and challenging themselves. Not when they are figuring out what magic configuration setting is breaking the platform. Command over their domain and the ability and permission to “fix it” is important to almost every appdev.

If developers are expensive to hire, train, replace and keep then they need to be enabled to do their job to the best of their ability. If a distributed, microservices platform has led your team to solving issues in the fashion of Sherlock Holmes solving his latest mystery, then perhaps you need a different approach.

Enter Istio and Aspen Mesh

It’s hard to know where the industry is with respect to the Hype Cycle for technologies like microservices, container orchestration, service mesh and a myriad of other choices; this isn’t an exact science where we can empirically take measurements. Most companies have older, but proven, systems built on LAMP or Java application servers or monoliths or applications that run on a big iron system. Those aren’t going away anytime soon, and developers will need to continue to support and add new features and capabilities to these applications.

Any new technology must provide a path for people to migrate their existing systems to something new.

If you have decided to or are moving towards a microservice architecture, even if you have a monolith, implementing a service mesh should be among the possibilities explored. If you already have a microservice architecture that leverages gRPC or HTTP, and you're using Kubernetes then the benefits of a service mesh can be quickly realized. It's easy to sign up for our beta and install Aspen Mesh and the sample bookinfo application to see things in action. Once I did is when I became a true believer. Not being coupled with a particular cloud provider, but being flexible and able to choose where and how things are deployed empowers developers and companies to make their own choices.

Over the past month I’ve been able to quickly write application code and get it delivered faster than ever before; that is in large part due to the platform my colleagues have built on top of Kubernetes and Istio. I’ve been impressed by how easy a well built cloud-native architecture can make things, and learning more about where Aspen Mesh, Istio and Kubernetes are heading gives me confidence that community and adoption will continue to grow.

As someone that has dealt with distributed systems issues continuously throughout his career, I know managing and troubleshooting a distributed system can be exhausting. I just want to enable others, even Aspen Mesh as we dogfood our own software, to do their jobs. To enable developers to add value and solve difficult problems. To enable a company to monitor their systems, whether it be mission critical or a simple CRUD application, to help ensure high uptime and responsiveness. To enable systems to be easily auditable when the compliance personnel has GRDP, PCI DSS or HIPAA concerns. To enable developers to quickly diagnose issues within their own system, fix them and monitor the change. To enable developers to understand how their services are communicating with each other--if it’s an n-tier system or a spider’s web--and how requests propagate through their system.

The value of Istio and the benefits of Aspen Mesh in solving these challenges is what drew me here. The opportunities are abundant and fruitful. I get to program in go, in a SaaS environment and on a small team with a solid architecture. I am looking forward to becoming a part of the larger CNCF community. With microservices and cloud computing no longer being niche--which I’d argue hasn’t been the case for years--and with businesses adopting these new technology patterns quickly, I feel as if I made the right long-term career choice.