Advancing the promise of service mesh: Why I work at Aspen Mesh

The themes and content expressed are mine alone, with helpful insight and thoughts from my colleagues, and are about software development in a business setting.

I’ve been working at Aspen Mesh for a little over a month and during that time numerous people have asked me why I chose to work here, given the opportunities in Boulder and the Front Range.

To answer that question, I need to talk a bit about my background. I’ve been a professional software developer for about 13 years now. During that time I’ve primarily worked on the back-end for distributed systems and have seen numerous approaches to the same problems with various pros and cons. When I take a step back, though, a lot of the major issues that I’ve seen are related to deployment and configuration around service communication:
How do I add a new service to an already existing system? How big, in scope, should these services be? Do I use a message broker? How do I handle discoverability, high availability and fault tolerance? How, and in what format, should data be exchanged between services? How do I audit the system when the system inevitably comes under scrutiny?

I’ve seen many different approaches to these problems. In fact, there are so many approaches, some orthogonal and some similar, that software developers can easily get lost. While the themes are constant, it is time-consuming for developers to get up to speed with all of these technologies. There isn’t a single solution that solves every common problem seen in the back-end; I’m sure the same applies to the front-end as well. It’s hard to truly understand the pros and cons of an approach until you have a working system; and if you then realize that the cons outweigh the pros, it may be difficult and costly to get back to where you started (see sunk cost fallacy and opportunity cost). Conversely, analysis paralysis is also costly to an organization, both in terms of capital—software developers are not cheap—and an inability to quickly adapt to market pressures, be it customer needs and requirements or a competitor that is disrupting the market.

Yet the hype cycle continues. There is always a new shiny thing taking the software world by storm. You see it in discussions on languages, frameworks, databases, messaging protocols, architectures ad infinitum. Separating the wheat from the chaff is something developers must do to ensure they are able to meet their obligations. But with the signal-to-noise ratio being low at times and with looming deadlines, not all possibilities can be explored.

So as software developers, we have an obligation to do our due diligence and to deliver software that provides customer value; software that helps customers get their work done and doesn’t impede them, but enables them. Most customers don’t care about which languages you use, which databases you use, how you build your software or what software process methodology you adhere to, if any. They just want the software you provide to enable them to do their work. In fact, that sentiment is so strong that slogans have been made around it.

So what do customers care about, generally speaking? They care about access to their data: how they can view it, modify it and draw value from it. It should look and feel modern, but even that isn’t a strict requirement. It should be simple to use for a novice, yet provide enough advanced capability that your most advanced users teach you something new about the tool you’ve created. This is information technology after all. Technology for technology’s sake is not a useful outcome.

Any work that detracts from adding customer value needs to be deprioritized, as there is always more work to do than hours in the day. As developers, it’s our job to be knee deep in the weeds, so it’s easy to lose sight of that; unit testing, automation, language choice, cloud provider, software process methodology and so on absolutely matter, but they are a means to an end.

With that in mind, let’s create a goal: application developers should be application developers.

Not DevOps engineers, SREs, CSRs or any of the myriad other roles they are often asked to take on. I’ve seen my peers happiest when they are solving difficult problems and challenging themselves, not when they are figuring out what magic configuration setting is breaking the platform. Command over their domain and the ability and permission to “fix it” are important to almost every application developer.

If developers are expensive to hire, train, replace and keep, then they need to be enabled to do their job to the best of their ability. If a distributed, microservices platform has led your team to solving issues in the fashion of Sherlock Holmes solving his latest mystery, then perhaps you need a different approach.

Enter Istio and Aspen Mesh

It’s hard to know where the industry is with respect to the Hype Cycle for technologies like microservices, container orchestration, service mesh and a myriad of other choices; this isn’t an exact science where we can empirically take measurements. Most companies have older, but proven, systems built on LAMP or Java application servers or monoliths or applications that run on a big iron system. Those aren’t going away anytime soon, and developers will need to continue to support and add new features and capabilities to these applications.

Any new technology must provide a path for people to migrate their existing systems to something new.

If you have decided to move, or are moving, towards a microservice architecture, even if you have a monolith, implementing a service mesh should be among the possibilities explored. If you already have a microservice architecture that leverages gRPC or HTTP, and you're using Kubernetes, then the benefits of a service mesh can be quickly realized. It's easy to sign up for our beta, install Aspen Mesh and the sample bookinfo application, and see things in action. Once I did, I became a true believer. Not being coupled to a particular cloud provider, but being flexible and able to choose where and how things are deployed, empowers developers and companies to make their own choices.

Over the past month I’ve been able to quickly write application code and get it delivered faster than ever before; that is in large part due to the platform my colleagues have built on top of Kubernetes and Istio. I’ve been impressed by how easy a well-built cloud-native architecture can make things, and learning more about where Aspen Mesh, Istio and Kubernetes are heading gives me confidence that community and adoption will continue to grow.

As someone who has dealt with distributed systems issues continuously throughout his career, I know managing and troubleshooting a distributed system can be exhausting. I just want to enable others, even Aspen Mesh as we dogfood our own software, to do their jobs. To enable developers to add value and solve difficult problems. To enable a company to monitor its systems, whether mission-critical or a simple CRUD application, to help ensure high uptime and responsiveness. To enable systems to be easily auditable when compliance personnel have GDPR, PCI DSS or HIPAA concerns. To enable developers to quickly diagnose issues within their own system, fix them and monitor the change. To enable developers to understand how their services are communicating with each other, whether it’s an n-tier system or a spider’s web, and how requests propagate through their system.

The value of Istio and the benefits of Aspen Mesh in solving these challenges are what drew me here. The opportunities are abundant and fruitful. I get to program in Go, in a SaaS environment, on a small team with a solid architecture. I am looking forward to becoming a part of the larger CNCF community. With microservices and cloud computing no longer being niche (which I’d argue hasn’t been the case for years) and with businesses adopting these new technology patterns quickly, I feel as if I made the right long-term career choice.


Enterprise service mesh

Aspen Mesh Open Beta Makes Istio Enterprise-ready

As companies build modern applications, they are leveraging microservices to effectively build and manage them. As they do, they realize that they are increasing complexity and de-centralizing ownership and control. These new challenges require a new way to monitor, manage and control microservice-based applications at runtime.

A service mesh is an emerging pattern that helps ensure resiliency and uptime: a way to more effectively monitor, control and secure the modern application at runtime. Companies are adopting Istio as their service mesh of choice as it provides a toolbox of different features that address various microservices challenges. Istio provides a solution to many challenges, but leaves some critical enterprise challenges on the table. Enterprises require additional features that address observability, policy and security. With this in mind, we have built new enterprise features into a platform that runs on top of Istio, to provide all the functionality and flexibility of open source, plus the features, support and guarantees needed to power enterprise applications.

At KubeCon North America 2018, Aspen Mesh announced open beta. With Aspen Mesh you get all the features of Istio, plus:

Advanced Policy Features
Aspen Mesh provides RBAC capabilities you don’t get with Istio.

Configuration Vets
Istio Vet (an Aspen Mesh open source contribution that helps ensure correct configuration of your mesh) is built into Aspen Mesh, and you get additional features as part of Aspen Mesh that don’t come with open source Istio Vet.

Analytics and Alerting
The Aspen Mesh platform provides insights into key metrics (latency, error rates, mTLS status) and immediate alerts so you can take action to minimize MTTD/MTTR.

Multi-cluster/Multi-cloud
View multiple clusters that live in different clouds in a single place, so you can see what’s going on in your microservice architecture through a single pane of glass.

Canary Deploys
Aspen Mesh Experiments lets you quickly test new versions of microservices so you can qualify new versions in a production environment without disrupting users.

An Intuitive UI
Get at-a-glance views of performance and security posture as well as the ability to see service details.

Full Support
Our team of Istio experts makes it easy to get exactly what you need out of service mesh. 

You can take advantage of these features for free by signing up for Aspen Mesh Beta access.


Security

A Service Mesh Helps Simplify PCI DSS Compliance

PCI DSS is an information security standard for organizations that handle credit card data. The requirements are largely around developing and maintaining secure systems and applications and providing appropriate levels of access — things a service mesh can make easier.

However, building a secure, reliable and PCI DSS-compliant microservice architecture at scale is a difficult undertaking, even when using a service mesh.

The standard comprises 12 separate requirements, each of which has different sub-requirements. Additionally, some of the requirements are vague and are left up to the designated Qualified Security Assessor (QSA) to make their best judgment based on the design in question.

Meeting these requirements involves:

  • Controlling what services can talk to each other;
  • Guaranteeing non-repudiation for actors making requests;
  • Building accurate real-time and historical diagrams of cardholder data flows across systems and networks when services can be added, removed or updated at a team’s discretion.

Achieving PCI DSS compliance at scale can be simplified by implementing a uniform layer of infrastructure between services and the network that provides your operations team with centralized policy management and decouples them from feature development and release processes, regardless of scale and release velocity. This layer of infrastructure is commonly referred to as a service mesh. A service mesh provides many features that simplify compliance management, such as fine-grained control over service communication, traffic encryption, non-repudiation via service-to-service authentication with strong identity assertions, and rich telemetry and reporting.

Below are some of the key PCI DSS requirements listed in the 3.2.1 version of the requirements document, where a service mesh helps simplify the implementation of both controls and reporting:

  • Requirement 1: Install and maintain a firewall configuration to protect cardholder data
    • The first requirement focuses on firewall and router configurations that ensure cardholder data is only accessed when it should be and only by authorized sources.
  • Requirement 6: Develop and maintain secure systems and applications
    • The applicable portions of this requirement focus on encrypting network traffic using strong cryptography and restricting user access to URLs and functions.
  • Requirement 7: Restrict access to cardholder data by business need to know.
    • Arguably one of the most critical requirements in PCI DSS, since even the most secure system can be easily circumvented by overprivileged employees. This requirement focuses on restricting privileged users to the least privileges necessary to perform job responsibilities, ensuring access to systems is set to “deny all” by default, and ensuring proper documentation detailing roles and responsibilities is in place.
  • Requirement 8: Identify and authenticate access to system components
    • Building on the foundation of requirement 7, this requirement focuses on ensuring all users have a unique ID; controlling the creation, deletion and modification of identifier objects; revoking access; utilizing strong cryptography during credential transmission; verifying user identity before modifying authentication credentials.
  • Requirement 10: Track and monitor all access to network resources and cardholder data
    • This requirement puts a heavy emphasis on designing and implementing a secure, reliable and accurate audit trail of all events within the environment. This includes capturing all individual user access to cardholder data, invalid logical access attempts and intersystem communication logs. All audit trail entries should include: user identification, type of event, date and time, success or failure indication, origin of event and the identity of the affected data or resource.

Let’s review how Aspen Mesh, the enterprise-ready service mesh built on Istio, helps simplify both the implementation and reporting of controls for the above requirements.

‘Auditable’ Real-time and Historical Dataflows

Keeping records of how data flows through your system is one of the key requirements of PCI DSS compliance (1.1.3), as well as a security best practice. With Aspen Mesh, you can see a live view of your service-to-service communication and retrieve historical configurations detailing which services were communicating, their corresponding security configuration (e.g. whether or not mutual TLS was enabled, certificate thumbprint and validity period, internal IP address and protocol) and what the entire cluster configuration was at that point in time (minus secrets, of course).

Encrypting Traffic and Achieving Non-Repudiation for Requests

Defense in depth is an industry best practice when securing sensitive data. Aspen Mesh can automatically encrypt and decrypt service requests and responses without development teams having to write custom logic per service. This helps reduce the amount of non-functional feature development work teams are required to do prior to delivering their services in a secure and compliant environment. The mesh also provides non-repudiation for requests by authenticating clients via mutual TLS and through a key management system that automates key and certificate generation, distribution and rotation, so you can be sure requests are actually coming from who they say they’re coming from. Encrypting traffic and providing non-repudiation is key to implementing controls that ensure sensitive data, such as a Primary Account Number (PAN), are protected from unauthorized actors sniffing traffic, and to providing definitive proof for auditing events as required by PCI DSS requirements 10.3.1, 10.3.5, and 10.3.6.
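In Istio, mesh-wide mutual TLS is enabled declaratively rather than per service. A minimal sketch of what this looks like, assuming a recent Istio release (older releases used the MeshPolicy resource for the same purpose):

```yaml
# Require mutual TLS for every workload in the mesh by applying a
# PeerAuthentication policy to the Istio root namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```

With this in place, the sidecar proxies handle certificate exchange and rotation; application code continues to speak plain HTTP or gRPC to its local proxy.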

Strong Access Control and Centralized Policy Management

Aspen Mesh provides flexible and highly performant Role-Based Access Control (RBAC) via centralized policy management. RBAC allows you to easily implement and report on controls for requirements 7 and 8 – Implement Strong Access Control Measures. With policy control, you can easily define what services are allowed to communicate, what methods services can call, rate limit requests, and define and enforce quotas.
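To illustrate what a “deny all by default” posture looks like in mesh policy, here is a sketch using Istio's AuthorizationPolicy resource; the service, namespace and account names are hypothetical, and early Istio releases expressed the same intent through the ServiceRole/ServiceRoleBinding API:

```yaml
# Default-deny: an AuthorizationPolicy with an empty spec matches no
# requests, so all traffic to workloads in this namespace is rejected.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: payments
spec: {}
---
# Explicitly allow the checkout service, identified by its mTLS
# identity, to call GET and POST on the payments workload.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-checkout
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/shop/sa/checkout"]
    to:
    - operation:
        methods: ["GET", "POST"]
```

Because policy lives in the mesh rather than in application code, the same audit trail that proves “deny all” is in effect also documents every explicit exception.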

Centralized Tracing, Monitoring and Alerting

One of the most difficult non-functional features to implement at scale is consistent and reliable application tracing. With Aspen Mesh, you get reliable and consistent in-depth tracing between all services within the mesh, configurable real-time dashboards, the ability to create criteria-driven alerts and the ability to retain your logs for at least one year, which exceeds the requirement that a minimum of three months' data be immediately available for analysis (PCI DSS requirements 10.1, 10.2-10.2.4, 10.3.1-10.3.6 and 10.7).

Aspen Mesh Makes it Easier to Implement and Scale a Secure and PCI DSS Compliant Microservice Environment

Managing a microservice architecture at scale is a serious challenge without the right tools. Ensuring each service follows the organizational policies for secure communication, authentication, authorization and monitoring needed to comply with PCI DSS is not easy.

Achieving PCI DSS compliance involves addressing a number of different things around firewall configuration, developing applications securely and fine-grained RBAC. These are all distinct development efforts that can be hard to achieve individually, but even harder to achieve as a coordinated team. The good news is, with the help of Aspen Mesh, your engineering team can spend less time building and maintaining non-functional yet essential features, and more time building features that provide direct value to your customers.

Learn More About Security and Service Mesh

Interested in learning more about how service mesh can help you achieve security? Get the free white paper on achieving Zero-trust security for containerized applications.

Originally posted on The New Stack

DevOps and service mesh

How Service Mesh Enables DevOps

I spend most of my day talking to large companies about how they are transforming their businesses to compete in an increasingly disruptive environment. This isn’t anything new, anyone who has read Clayton Christensen’s Innovator’s Dilemma understands this. What’s most interesting to me is how companies are addressing disruption. Of course, they are creating new products to remain competitive with the disruptors, but they are also taking a page out of their smaller, more nimble competitors’ playbook and focusing on being more efficient.

Companies are transforming internal organizations and product architectures along a new axis of performance. They are finding more value in iterations, efficiency and incremental scaling, which is forcing them to adopt DevOps methodologies. This focus on time-to-market is driving some of the most cutting-edge infrastructure technology that we have ever seen. Technologies like containers and Kubernetes, and a focus on stable, consistent and open APIs, allow small teams to make amazing progress and move at the speeds they require. These technologies have reduced friction and time to market, and the result is the quickest adoption of a new technology that anyone has ever seen.

The adoption of these technologies isn’t perfect, and as companies deploy them at scale they realize that they have inadvertently increased complexity and de-centralized ownership and control. In many cases, no single person can understand the entire system, yet everyone needs to be an expert in compliance and business needs. Ultimately this means that when everyone is responsible, no one is accountable.

A service mesh enables DevOps by helping you to manage this complexity. It provides autonomy and freedom for development teams while simultaneously providing a place for teams of experts to enforce company standards for policy and security. It does this by providing a layer between your teams’ applications and the platform they are running on that allows platform operators a place to insert network services, enforce policy and collect telemetry and tracing data.

This empowers your development teams to make choices based on the problem they are solving rather than being concerned with the underlying infrastructure. Dev teams now have the freedom to deploy code without the fear of violating compliance or regulatory guidelines. Secure communication is handled outside of the application reducing complexity and risk. A service mesh also provides tools that developers can use to deploy new code and debug or troubleshoot problems when they come up.

For the platform operator, whose primary objective is to provide a stable, secure and scalable service to run applications, a service mesh provides uniformity through a standardization of visibility and tracing. Policy and authentication between services can be introduced outside of the application at runtime ensuring that applications are adhering to any regulatory requirements the business may have. Deploying Aspen Mesh provides a robust experiments workflow to enable development teams to test new services using real production traffic. Our platform also provides tools that reduce mean-time-to-detection (MTTD) and mean-time-to-resolution (MTTR) with advanced analytics that are part of our SaaS portal.
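Under the hood, an experiment like this builds on Istio's traffic-splitting primitives. A minimal sketch of shifting 10% of traffic to a candidate version (the service name and subset labels are illustrative):

```yaml
# Name the two versions of the service as routable subsets.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
# Route 90% of traffic to the stable version and 10% to the canary.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
```

Adjusting the weights, or rolling back entirely, is a configuration change; no application redeploy is required.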

DevOps represents two teams, Development and Operations, coming together to deliver better products more rapidly. Service mesh is a glue that helps unite these teams and provides one place in the stack that you can manage microservices at runtime without changes to the application or cluster.

The result is a platform that empowers application developers to focus on their code, and allows operators to more easily provide developers with a resilient, scalable and secure environment.


service mesh

How The Service Mesh Space Is Like Preschool

I have a four-year-old son who recently started attending full day preschool. It has been fascinating to watch his interests shift from playing with stuffed animals and pushing a corn popper to playing with his science set (w00t for the STEM lab!) and riding his bike. The other kids in school are definitely informing his view of what cool new toys he needs. Undoubtedly, he could still make do with the popper and stuffed animals (he may sleep with Lambie until he's ten), but as he progresses his desire to explore new things increases.

Watching the community around service mesh develop is similar to watching my son's experience in preschool (if you're willing to make the stretch with me). People have come together in a new space to learn about cool new things, and as excited as they are, they don't completely understand the cool new things. Just as in preschool, there are a ton of bright minds that are eager to soak up new knowledge and figure out how to put it to good use.

Another parallel between my son and many of the people we talk to in the service mesh space is that they both have a long and broad list of questions. In the case of my son, it's awesome because they're questions like: "Is there a G in my name?" "What comes after Sunday?" "Does God live in the sky with the unicorns?" The questions we get from prospects and clients on service mesh are a bit different but equally interesting. It would take more time than anybody wants to spend to cover all these questions, but I thought it might be interesting to cover the top 3 questions we get from users evaluating service mesh.

What do I get with a service mesh?

We like getting this question because the answer to it is a good one. You get a toolbox that gives you a myriad of different capabilities. At a high level, what you get is observability, control and security of your microservice architecture. The features that a service mesh provides include:

  • Load balancing
  • Service discovery
  • Ingress and egress control
  • Distributed tracing
  • Metrics collection and visualization
  • Policy and configuration enforcement
  • Traffic routing
  • Security through mTLS

When do I need a service mesh?

You don't need 1,000 microservices for a service mesh to make sense. If you have nicknames for your monoliths, you're probably a ways away from needing a service mesh. And you probably don't need one if you only have two services, but if you have a few services and plan to continue down the microservices path, it is easier to get started sooner. We are believers that containers and Kubernetes will be the way companies build infrastructure in the future, and waiting to hop on that train will only be a competitive disadvantage. Generally, we find that the answer to this question usually hinges on whether or not you are committed to cloud native. Service meshes like Aspen Mesh work seamlessly with cloud native tools, so the barrier to entry is low, and running cloud native applications will be much easier with the help of a service mesh.

What existing tools does service mesh allow me to replace?

This answer all depends on what functionality you want. Here's a look at the tools a service mesh overlaps with, what it provides and what you'll need to keep old tools for.

API gateway
Not yet. It replaces some of the functionality of an API gateway but does not yet cover all of the ingress and payment features an API gateway provides. Chances are API gateways and service meshes will converge in the future.

Tracing Tools
You get tracing capabilities as part of Istio. If you are using distributed tracing tools such as Jaeger or Zipkin, you no longer need to continue managing them separately as they are part of the Istio toolbox. With Aspen Mesh's hosted SaaS platform, we offer managed Jaeger so you don't even need to deploy or manage them.

Metrics Tools
Just like tracing, a metrics monitoring tool is included as part of Istio. Istio leverages Prometheus to query metrics, and you have the option of visualizing them through the Prometheus UI or using Grafana dashboards. With Aspen Mesh's hosted SaaS platform, we offer managed Prometheus and Grafana so you don't even need to deploy or manage them.

Load Balancing
Yep. Envoy is the sidecar proxy used by Istio and provides load balancing functionality such as automatic retries, circuit breaking, global rate limiting, request shadowing and zone-local load balancing. You can use a service mesh in place of tools like HAProxy or NGINX for ingress load balancing.
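As an illustration, circuit breaking with Envoy via Istio is expressed declaratively in a DestinationRule rather than coded into each service; the service name and thresholds below are illustrative, and exact field names vary across Istio versions:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      http:
        # Cap pending requests and per-connection reuse to shed load early.
        http1MaxPendingRequests: 100
        maxRequestsPerConnection: 10
    outlierDetection:
      # Eject a backend from the load-balancing pool after 5
      # consecutive errors, for at least 60 seconds.
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```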

Security tools
Istio provides mTLS capabilities that address some important microservices security concerns. If you’re using SPIRE, Istio’s built-in identity system, which also implements the SPIFFE framework, may be able to replace it. An important thing to note is that while a service mesh adds several important security features, it is not the end-all-be-all for microservices security. It’s important to also consider a strategy around network security.

If you have little ones and would be interested in comparing notes on the fantastic questions they ask, let’s chat. I'd also love to talk anything service mesh. We have been helping a broad range of customers get started with Aspen Mesh and make the most out of it for their use case. We’d be happy to talk about any of those experiences and best practices to help you get started on your service mesh journey. Leave a comment here or hit me up @zjory.


Container orchestration

Going Beyond Container Orchestration

Every survey of late tells the same story about containers: organizations are not only adopting but embracing the technology, even if most aren't relying on containers with the same degree of criticality as hyperscale organizations. In a Cisco-sponsored survey of over 8,000 enterprises, IDC found that 85% of organizations are using containers in production. That sounds impressive, but the scale at which they use them is limited: in a Forrester report commissioned by Dell EMC, Intel, and Red Hat, 63% of enterprises using containers have more than 100 instances running, and 82% expect to be doing the same by 2019. That's a far cry from the hundreds of thousands in use by hyperscale technology companies.

And though the adoption rate is high, that's not to say that organizations haven't dabbled with containers only to abandon the effort. As with any (newish) technology, challenges exist. At the top of the list for containers are suspects you know and love: networking and management.

Some of the networking challenges are due to the functionality available in popular container orchestration environments like Kubernetes. Kubernetes supports microservices architectures through its service construct. This allows developers and operators to abstract the functionality of a set of pods and expose it as "a service" with access via a well-defined API. Kubernetes supports naming services as well as performing rudimentary layer 4 (TCP-based) load balancing.

The problem with layer 4 (TCP-based) load balancing is its inability to interact with layer 7 (application and API layers). This is true for any layer 4 load balancing; it's not something unique to containers and Kubernetes. Layer 4 offers visibility into connection level (TCP) protocols and metrics, but nothing more. That makes it difficult (impossible, really) to address higher-order problems such as layer 7 metrics like requests or transactions per second and the ability to split traffic (route requests) based on path. It also means you can't really do rate limiting at the API layer or support key capabilities like retries and circuit breaking.
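For contrast, once a layer 7 proxy sits in the data path, splitting traffic by URL path becomes a few lines of routing configuration. A sketch using Istio's VirtualService (the hostname, gateway and service names are illustrative):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts:
  - bookinfo.example.com
  gateways:
  - bookinfo-gateway
  http:
  - match:
    - uri:
        prefix: /reviews
    route:
    - destination:
        host: reviews      # layer 7: route by URL path
  - match:
    - uri:
        prefix: /ratings
    route:
    - destination:
        host: ratings
```

A Kubernetes Service, operating at layer 4, has no visibility into the request path and cannot make this distinction.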

The lack of these capabilities drives developers to encode them into each microservice instead. That results in operational code being included with business logic. This should cause some amount of discomfort, as it clearly violates the principles of microservice design. It's also expensive as it adds both architectural and technical debt to microservices.

Then there's management. While Kubernetes is especially adept at handling build and deploy challenges for containerized applications, it lacks key functionality needed to monitor and control microservice-based apps at runtime. Basic liveness and health probes don't provide the granularity of metrics or the traceability needed for developers and operators to quickly and efficiently diagnose issues during execution. And getting developers to instrument microservices to generate consistent metrics can be a significant challenge, especially when time constraints are putting pressure on them to deliver customer-driven features.

These are two of the challenges a service mesh directly addresses: management and networking.

How Service Mesh Answers the Challenge

Both are more easily addressed by the implementation of a service mesh as a set of sidecar proxies. By plugging directly into the container environment, sidecar proxies enable transparent networking capabilities and consistent instrumentation. Because all traffic is effectively routed through the sidecar proxy, it can automatically generate and feed the metrics you need to the rest of the mesh. This is incredibly valuable for those organizations that are deploying traditional applications in a container environment. Legacy applications are unlikely to be instrumented for a modern environment. The use of a service mesh and its sidecar proxy basis enable those applications to emit the right metrics without requiring code to be added/modified.

It also means that you don't have to spend your time reconciling different metrics being generated by a variety of runtime agents. You can rely on one source of truth - the service mesh - to generate a consistent set of metrics across all applications and microservices.

Those metrics can include higher-order data points that are fed into the mesh and enable more advanced networking to ensure the fastest available responses to requests. Retries and circuit breaking are handled by the sidecar proxy in a service mesh, relieving developers of the burden of introducing operational code into their microservices. Because the sidecar proxy is not constrained to layer 4 (TCP), it can support advanced message routing techniques that rely on access to layer 7 (application and API).
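As a sketch of what this looks like in practice, Istio's v1alpha3 API lets you declare retries and circuit breaking as configuration rather than application code. The service name and thresholds below are illustrative, not prescriptive:

```yaml
# Retry failed requests to "payments" up to 3 times (illustrative values).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: payments
spec:
  hosts:
  - payments
  http:
  - route:
    - destination:
        host: payments
    retries:
      attempts: 3
      perTryTimeout: 2s
---
# Circuit breaking via outlier detection: eject an unhealthy instance
# from the load balancing pool after 5 consecutive errors.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: payments
spec:
  host: payments
  trafficPolicy:
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
```

The microservice itself contains none of this logic; the sidecar enforces it uniformly for every caller.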

Container orchestration is a good foundation, but enterprise organizations need more than just a good foundation. They need the ability to interact with services at the upper layers of the stack, where metrics and modern architectural patterns are implemented today.

Both are best served by a service mesh. When you need to go beyond container orchestration, go service mesh.


API Gateway vs Service Mesh


One of the recurring questions we get when talking to people about a service mesh is, "How is it different from an API gateway?" It's a good question; the overlap between the API gateway and service mesh patterns is significant. Both can handle service discovery, request routing, authentication, rate limiting and monitoring. But there are differences in architecture and intent: a service mesh's primary purpose is to manage internal service-to-service communication, while an API gateway is primarily meant for external client-to-service communication.


API Gateway and Service Mesh: Do You Need Both?

You may be wondering if you need both an API gateway and a service mesh. Today you probably do, but as service mesh evolves, we believe it will incorporate much of what you get from an API gateway today.

The main purpose of an API gateway is to accept traffic from outside your network and distribute it internally. The main purpose of a service mesh is to route and manage traffic within your network. A service mesh can work with an API gateway to efficiently accept external traffic and then effectively route it once it's inside your network. The combination of these technologies can be a powerful way to ensure application uptime and resiliency, while keeping your applications easily consumable.

In a deployment with an API gateway and a service mesh, incoming traffic from outside the cluster would first be routed through the API gateway, then into the mesh. The API gateway could handle authentication, edge routing and other edge functions, while the service mesh provides fine-grained observability and control of your architecture.
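A minimal sketch of the edge half of that deployment, using Istio's Gateway resource (the hostname and certificate paths are hypothetical):

```yaml
# Accept external HTTPS traffic at the edge; VirtualServices bound to
# this gateway then route requests to services inside the mesh.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      serverCertificate: /etc/certs/cert.pem
      privateKey: /etc/certs/key.pem
    hosts:
    - "api.example.com"
```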

The interesting thing to note is that service mesh technologies are evolving quickly and starting to take on some functions of an API gateway. A great example is the introduction of the Istio v1alpha3 routing API, which is available in Aspen Mesh 1.0. Prior to this, Istio relied on the Kubernetes Ingress resource, which is fairly basic, so it made sense to use an API gateway for richer functionality. But the increased functionality introduced by the v1alpha3 API has made it easier to manage large applications and to work with protocols other than HTTP, something that previously required an API gateway to do effectively.
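For example, the v1alpha3 API can route plain TCP traffic, not just HTTP. A sketch with a hypothetical in-cluster Redis service:

```yaml
# Route raw TCP traffic on the Redis port to the Redis service;
# no API gateway or HTTP semantics required.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: redis
spec:
  hosts:
  - redis.prod.svc.cluster.local
  tcp:
  - match:
    - port: 6379
    route:
    - destination:
        host: redis.prod.svc.cluster.local
        port:
          number: 6379
```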

What The Future Holds

The v1alpha3 API provides a good example of how a service mesh is reducing the need for API gateway capabilities. As the cloud native space evolves and more organizations move to Docker and Kubernetes to manage their microservice architectures, it seems highly likely that service mesh and API gateway functionality will merge. In the next few years, we believe standalone API gateways will be used less and less, as much of their functionality is absorbed by the service mesh.

If you have any questions about service mesh along the way, feel free to reach out.



Enabling the Financial Services Shift to Microservices

Financial services has historically been an industry riddled with barriers to entry. Challengers found it difficult to break through low margins and tightening regulations. However, large enterprises that once dominated the market are now facing disruption from smaller, leaner fintech companies that are eating away at the value chain. These disruptors are marked by technological agility, specialization and customer-centric UX. To remain competitive, financial services firms are reconsidering their cumbersome technical architectures and transforming them into something more adaptable. A recent survey of financial institutions found that ~85% consider their core technology to be too rigid and slow. Consequently, ~80% are expected to replace their core banking systems within the next five years.

Emerging regulations meant to address the new digital payment economy, such as PSD2 regulations in Europe, will require banks to adopt a new way to operate and deliver. Changes like PSD2 are aimed at bringing banking into the open API economy, driving interoperability and integration through open standards. To become a first class player in this new world of APIs, integration, and open data, financial services firms will need the advantages provided by microservices.

Microservices provide 3 key advantages for financial services

Enhanced Security

Modern fintech requirements create challenges for established security infrastructure. Features like digital wallets, robo-advisory and blockchain mandate new security mechanisms. Microservices follow the best practice of creating a separate identity service, which addresses these new requirements.

Faster Delivery

Rapidly bringing new features to market is a cornerstone of successful fintech companies. Microservices make it easier for different application teams to independently deliver new functionality to meet emerging customer demands. Microservices also scale well to accommodate greater numbers of users and transactions.

Seamless Integration

The integration layer in a modern fintech solution needs a powerful set of APIs to communicate with other services, both internally and externally. This API layer is notoriously challenging to manage in a large monolithic application. Microservices make the API layer much easier to manage and secure through isolation, scalability and resilience.

Service mesh makes it easier to manage a complex microservice architecture

In the face of rapidly changing customer, business and regulatory requirements, microservices help financial services companies respond quickly. But this doesn't come for free: companies take on increased operational overhead during the shift to microservices – technologies such as a service mesh can help manage that.

Service mesh provides a bundle of features around observability, security and control that are crucial to managing microservices at scale. Pre-existing solutions like DNS and configuration management provide some capabilities, such as service discovery, but don't provide fast retries, load balancing, tracing or health monitoring. The old approach to managing microservices requires cobbling together several different solutions each time a problem arises; a service mesh bundles it all together in a reusable package. While it's possible to accomplish some of what a service mesh manages with individual tools and processes, it's manual and time consuming.
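On the security front, for instance, a mesh like Istio can turn on mutual TLS for all service-to-service traffic with a few lines of configuration rather than per-service code. A sketch using Istio 1.0-era resources (treat the exact resource names as version-dependent):

```yaml
# Require mTLS for every service in the mesh (Istio 1.0-era MeshPolicy).
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}
---
# Tell client sidecars to originate mTLS when calling in-mesh services.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```

The sidecars handle certificate issuance, rotation and the TLS handshake; application code is unchanged, which matters in a regulated industry where encryption in transit is often mandatory.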

Competition from innovative fintech startups, along with ever-increasing customer expectations, means established financial services players must change the way they deliver offerings and do business with their customers. Delivering on these new requirements is difficult with legacy systems. Financial services firms need a software architecture that's fit for purpose – agile, adaptable, highly scalable, reliable and robust. Microservices make this possible, and a service mesh makes microservices manageable at scale.



How Service Mesh Addresses 3 Major Microservices Challenges

I was recently reading the Global Microservices Trends report by Dimensional Research and found myself thinking "a service mesh could help with that." So I thought I would cover three of those challenges and how a service mesh addresses them. Respondents cited in the report make it clear that microservices are gaining widespread adoption. It's also clear that along with the myriad benefits they bring, there are tough challenges that come as part of the package. The report shows:

91% of enterprises are using microservices or have plans to
99% of users report challenges with using microservices

Major Microservices Challenges

The report identifies a range of challenges companies are facing.

Companies are seeing a mix of technology and organizational challenges. I'll focus on the technological challenges a service mesh solves, but it's worth noting that a service mesh brings uniformity, making it possible to achieve the same view across teams, which can reduce the need for specialized skills.

Each additional microservice increases the operational challenges

Not with a service mesh! Infrastructure services were traditionally implemented as discrete appliances, which meant going to the actual appliance to get the service. Each appliance is unique, which makes monitoring, scaling and providing high availability for each one hard. A service mesh delivers these services inside the compute cluster itself through APIs and doesn't require any additional appliances. This flexible framework removes the operational complexity associated with modern applications: with a service mesh in place, adding new microservices doesn't have to add complexity.

It is harder to identify the root cause of performance issues

The service mesh toolbox gives you a couple of things that help solve this problem:

Distributed Tracing
Tracing provides service dependency analysis for different microservices and tracks requests as they traverse multiple microservices. It's also a great way to identify performance bottlenecks and zoom into a particular request to determine, for example, which microservice contributed to the latency of a request or which service created an error.

Metrics Collection
Another powerful thing you gain with service mesh is the ability to collect metrics. Metrics are key to understanding historically what has happened in your applications, and when they were healthy compared to when they were not. A service mesh can gather telemetry data from across the mesh and produce consistent metrics for every hop. This makes it easier to quickly solve problems and build more resilient applications in the future.

Differing development languages and frameworks

Another major challenge that report respondents noted was maintaining a distributed architecture in a polyglot world. When making the move from monolith to microservices, many companies struggle with the reality that, to make things work, they have to use different languages and tools. Large enterprises can be especially affected, as they have many large, distributed teams. A service mesh brings uniformity through programming-language agnosticism, addressing inconsistencies in a polyglot world where different teams, each with its own microservice, are likely to be using different programming languages and frameworks. A mesh also provides a uniform, application-wide point for introducing visibility and control into the application runtime, moving service communication out of the realm of implied infrastructure to where it can be easily seen, monitored, managed and controlled.

Microservices are cool, but service mesh makes them ice cold. If you're on the microservices journey and are finding it difficult to manage the infrastructure challenges, a service mesh may be the right answer. Let us know if you have any questions on how to get the most out of service mesh; our engineering team is always available to talk.


Tracing and Metrics: Getting the Most Out of Istio

Are you considering or using a service mesh to help manage your microservices infrastructure? If so, here are some basics on how a service mesh can help, the different architectural options, and tips and tricks on using some key CNCF tools that integrate well with Istio to get the most out of it.

The beauty of a service mesh is that it bundles so many capabilities together, freeing engineering teams from having to spend inordinate amounts of time managing microservices architectures. Kubernetes has solved many build and deploy challenges, but it is still time consuming and difficult to ensure reliability and security at runtime. A service mesh handles the difficult, error-prone parts of cross-service communication such as latency-aware load balancing, connection pooling, service-to-service encryption, instrumentation, and request-level routing.

Once you have decided a service mesh makes sense to help manage your microservices, the next step is deciding which service mesh to use. There are several architectural options: the earliest library approach, the node agent architecture, and the model that seems to be gaining the most traction, the sidecar model. We have also seen an evolution from data plane proxies like Envoy to service meshes such as Istio, which provide distributed control and data planes. We're active users of Istio, and believers that the sidecar architecture strikes the right balance between a robust feature set and a lightweight footprint, so let's take a look at how to get the most out of tracing and metrics with Istio.

Tracing

One of the capabilities Istio provides is distributed tracing. Tracing provides service dependency analysis for different microservices and tracks requests as they traverse multiple microservices. It's also a great way to identify performance bottlenecks and zoom into a particular request to determine, for example, which microservice contributed to the latency of a request or which service created an error.

We use and recommend Jaeger for tracing as it has several advantages:

  • OpenTracing compatible API
  • Flexible & scalable architecture
  • Multiple storage backends
  • Advanced sampling
  • Accepts Zipkin spans
  • Great UI
  • CNCF project with an active open source community

Metrics

Another powerful thing you gain with Istio is the ability to collect metrics. Metrics are key to understanding historically what has happened in your applications, and when they were healthy compared to when they were not. A service mesh can gather telemetry data from across the mesh and produce consistent metrics for every hop. This makes it easier to quickly solve problems and build more resilient applications in the future.

We use and recommend Prometheus for gathering metrics for several reasons:

  • Pull model
  • Flexible query API
  • Efficient storage
  • Easy integration with Grafana
  • CNCF project with an active open source community
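To illustrate the pull model, here is a fragment of a Prometheus configuration that discovers pods via the Kubernetes API and scrapes only the Envoy sidecars. The job name is arbitrary, and the stats path and discovery details vary by Istio version, so treat this as a sketch:

```yaml
scrape_configs:
- job_name: istio-sidecars
  metrics_path: /stats/prometheus    # Envoy's Prometheus stats endpoint
  kubernetes_sd_configs:
  - role: pod                        # discover pods via the Kubernetes API
  relabel_configs:
  # Keep only the istio-proxy (sidecar) containers.
  - source_labels: [__meta_kubernetes_pod_container_name]
    regex: istio-proxy
    action: keep
```

Because Prometheus pulls from each sidecar on its own schedule, applications never need to know where the metrics system lives.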

We also use Cortex, which is a powerful tool to enhance Prometheus. Cortex provides:

  • Long term durable storage
  • Scalable Prometheus query API
  • Multi-tenancy

Check out this webinar for a deeper look into what you can do with these tools and more.