Improving Microservices: Weighing Service Mesh Options and Benefits

Microservices are a hell of a drug. 

Rapid development. Easier testing and deployment. Applications that are simpler to change and maintain. 

It’s easy to understand why microservice-based applications are becoming more and more common. Through microservice architectures, enterprises are realizing: 

  • Improved scalability
  • Increased development velocity
  • Easier debugging
  • Better alignment between development and user requirements 

As companies build or convert to more modern applications, they are leveraging microservices to drive differentiation and market leadership. As a side effect, they realize that they are increasing complexity and decentralizing ownership and control. These new challenges require new solutions to effectively monitor, manage and control microservice-based applications at runtime.

Kubernetes has become the de facto method for enterprises to orchestrate containers. Kubernetes simplifies the work of technical teams by automating application processes and service deployments that were previously performed manually. Kubernetes is a superb tool for managing containerized application deployment challenges but leaves some runtime challenges on the table.

That’s where service mesh comes in. A service mesh like Aspen Mesh adds observability, security and policy capabilities to Kubernetes. A service mesh helps to ensure resiliency and uptime – it provides solutions that enable engineering teams to more effectively monitor, control and secure the modern application at runtime. Companies are adopting service mesh as a way to enhance Kubernetes, as it provides a toolbox of features that address various microservices challenges that modern enterprises are facing.

Thoughts on the state of microservices from OSCON

Having attended OSCON last week, it was interesting to have discussions with people spread across the microservices, Kubernetes and service mesh adoption curves. While it was clear that almost everyone is at least considering microservices, many are still waiting to see their peers implement before deciding on their own path forward. An interesting takeaway was that more and more organizations are looking to microservices for brownfield deployments, whereas even a couple of years ago almost everyone only considered building microservices architectures for greenfield. The conversations around brownfield signaled to me that as microservices technology and tooling continue to evolve, it is more feasible for non-unicorn companies to effectively and efficiently decompose the monolith into microservices.

Another observation is that Kubernetes use is starting to catch up to the hype. The decision to use Kubernetes for container orchestration was nearly unanimous among the OSCON attendees I spoke with. They were in various phases of implementation with some still just running POCs to evaluate the best use cases, but many conversations were centered on how companies are running Kubernetes in production for mission critical applications. 

Among those humming along with Kubernetes, many were interested in service mesh as a way to extend or enhance what they are getting from Kubernetes. The top three reasons people said they want to implement service mesh were: 

  • Observability - to better understand the behavior of Kubernetes clusters 
  • mTLS - to add cluster-wide service encryption
  • Distributed Tracing - to simplify debugging and speed up RCA

Gauging the cloud-native infrastructure space after OSCON, there is no doubt that there is still more exploration and evaluation of tools like Kubernetes and Istio, but the gap is definitely closing. Companies are closely watching the leaders in the space to see how they are implementing and what benefits and challenges they are facing. As more organizations successfully adopt these new technologies, it’s becoming obvious that while there is a skills gap and new complexity that must be accounted for, the outcomes around increased velocity, better resiliency and improved customer experience mandate that many organizations actively map their own path with microservices. This will help to ensure that they are not left behind by the market leaders in their space.



Going Beyond Container Orchestration

Every recent survey tells the same story about containers: organizations are not only adopting but embracing the technology. Most, however, aren't relying on containers with the same degree of criticality as hyperscale organizations. They are among the 85% of organizations that IDC, in a Cisco-sponsored survey of over 8,000 enterprises, found to be using containers in production. That sounds impressive, but the scale at which they use them is limited. In a Forrester report commissioned by Dell EMC, Intel, and Red Hat, 63% of enterprises using containers have more than 100 instances running, and 82% expect to be doing the same by 2019. That's a far cry from the hundreds of thousands in use by hyperscale technology companies.

And though the adoption rate is high, that's not to say that organizations haven't dabbled with containers only to abandon the effort. As with any (newish) technology, challenges exist. At the top of the list for containers are suspects you know and love: networking and management.

Some of the networking challenges are due to the functionality available in popular container orchestration environments like Kubernetes. Kubernetes supports microservices architectures through its service construct. This allows developers and operators to abstract the functionality of a set of pods and expose it as "a service" with access via a well-defined API. Kubernetes supports naming services as well as performing rudimentary layer 4 (TCP-based) load balancing.

The problem with layer 4 (TCP-based) load balancing is its inability to interact with layer 7 (application and API layers). This is true for any layer 4 load balancing; it's not something unique to containers and Kubernetes. Layer 4 offers visibility into connection level (TCP) protocols and metrics, but nothing more. That makes it difficult (impossible, really) to address higher-order problems such as layer 7 metrics like requests or transactions per second and the ability to split traffic (route requests) based on path. It also means you can't really do rate limiting at the API layer or support key capabilities like retries and circuit breaking.
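To make the layer 4 versus layer 7 distinction concrete, here is a minimal Python sketch. It is not how Kubernetes or any particular proxy implements routing; the route table, service names and weights are all hypothetical. The point is only that a layer 4 balancer can choose a backend from connection-level data, while a layer 7 router can see the request path and split traffic on it.

```python
import random

# Hypothetical route table: path prefix -> list of (backend, weight).
ROUTES = {
    "/api/orders": [("orders-v1", 90), ("orders-v2", 10)],
    "/api/users": [("users-v1", 100)],
}


def route_l4(backends, connection_id):
    """Layer 4: only connection-level data is visible, so pick by connection."""
    return backends[connection_id % len(backends)]


def route_l7(path, rng=random):
    """Layer 7: the request path is visible, so match the longest prefix
    and split traffic between versions by weight."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return None  # no route; a real proxy would answer with a 404
    prefix = max(matches, key=len)
    backends, weights = zip(*ROUTES[prefix])
    return rng.choices(backends, weights=weights, k=1)[0]
```

With this table, roughly one request in ten to /api/orders would land on the hypothetical orders-v2 backend, which is the kind of path-based split a layer 4 balancer has no way to express.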

The lack of these capabilities drives developers to encode them into each microservice instead. That results in operational code being included with business logic. This should cause some amount of discomfort, as it clearly violates the principles of microservice design. It's also expensive as it adds both architectural and technical debt to microservices.

Then there's management. While Kubernetes is especially adept at handling build and deploy challenges for containerized applications, it lacks key functionality needed to monitor and control microservice-based apps at runtime. Basic liveness and health probes don't provide the granularity of metrics or the traceability needed for developers and operators to quickly and efficiently diagnose issues during execution. And getting developers to instrument microservices to generate consistent metrics can be a significant challenge, especially when time constraints are putting pressure on them to deliver customer-driven features.

These are two of the challenges a service mesh directly addresses: management and networking.

How Service Mesh Answers the Challenge

Both management and networking are more easily addressed by the implementation of a service mesh as a set of sidecar proxies. By plugging directly into the container environment, sidecar proxies enable transparent networking capabilities and consistent instrumentation. Because all traffic is effectively routed through the sidecar proxy, it can automatically generate and feed the metrics you need to the rest of the mesh. This is incredibly valuable for those organizations that are deploying traditional applications in a container environment. Legacy applications are unlikely to be instrumented for a modern environment. The use of a service mesh and its sidecar proxy basis enables those applications to emit the right metrics without requiring code to be added or modified.

It also means that you don't have to spend your time reconciling different metrics being generated by a variety of runtime agents. You can rely on one source of truth - the service mesh - to generate a consistent set of metrics across all applications and microservices.
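As a rough illustration of why the sidecar pattern needs no application changes, consider the Python sketch below. It is an in-process stand-in, not a real network proxy, and the SidecarProxy class, the legacy_app handler and the metric names are all hypothetical. The application contains no metrics code; the wrapper that sits in front of it records the same counters for every request.

```python
import time
from collections import Counter


class SidecarProxy:
    """Hypothetical stand-in for a sidecar: wraps a handler it cannot modify."""

    def __init__(self, service_name, handler, clock=time.perf_counter):
        self.service_name = service_name
        self.handler = handler
        self.clock = clock
        self.request_count = Counter()  # keyed by (service, status)
        self.total_seconds = 0.0

    def handle(self, request):
        started = self.clock()
        try:
            status, body = self.handler(request)
        except Exception:
            # The proxy still sees the failure and records it uniformly.
            status, body = 500, "internal error"
        self.total_seconds += self.clock() - started
        self.request_count[(self.service_name, status)] += 1
        return status, body


def legacy_app(request):
    """Uninstrumented 'black box' application: no metrics code inside."""
    if request == "/boom":
        raise RuntimeError("unexpected failure")
    return 200, "hello"
```

Wrapping every service in the same proxy is what gives you one consistent set of counters, regardless of how each application was written.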

Those metrics can include higher order data points that are fed into the mesh and enable more advanced networking to ensure the fastest available responses to requests. Retries and circuit breaking are handled by the sidecar proxy in a service mesh, relieving the developer from the burden of introducing operational code into their microservices. Because the sidecar proxy is not constrained to layer 4 (TCP), it can support advanced message routing techniques that rely on access to layer 7 (application and API).
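For readers who haven't written this kind of operational code themselves, the sketch below shows roughly what retries and circuit breaking involve, and therefore what a sidecar takes off the developer's plate. It is a simplified Python illustration under my own assumptions (a consecutive-failure threshold, a fixed cool-down, no backoff between retries), not the algorithm any specific proxy uses. The class and parameter names are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # any success resets the failure count
        return result


def call_with_retries(breaker, func, attempts=3):
    """Retry a failing call, but stop immediately once the breaker opens."""
    last_error = None
    for _ in range(attempts):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise
        except Exception as exc:
            last_error = exc
    raise last_error
```

Every service that talks to another service would need some version of this logic. Moving it into the proxy keeps it out of the business code and makes the behavior the same everywhere.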

Container orchestration is a good foundation, but enterprise organizations need more than just a good foundation. They need the ability to interact with services at the upper layers of the stack, where metrics and modern architectural patterns are implemented today.

Both are best served by a service mesh. When you need to go beyond container orchestration, go service mesh.


Enabling the Financial Services Shift to Microservices

Financial services has historically been an industry riddled with barriers to entry. Challengers found it difficult to break through low margins and tightening regulations. However, large enterprises that once dominated the market are now facing disruption from smaller, leaner fintech companies that are eating away at the value chain. These disruptors are marked by technological agility, specialization and customer-centric UX. To remain competitive, financial services firms are reconsidering their cumbersome technical architectures and transforming them into something more adaptable. A recent survey of financial institutions found that ~85% consider their core technology to be too rigid and slow. Consequently, ~80% are expected to replace their core banking systems within the next five years.

Emerging regulations meant to address the new digital payment economy, such as PSD2 regulations in Europe, will require banks to adopt a new way to operate and deliver. Changes like PSD2 are aimed at bringing banking into the open API economy, driving interoperability and integration through open standards. To become a first class player in this new world of APIs, integration, and open data, financial services firms will need the advantages provided by microservices.

Microservices provide 3 key advantages for financial services

Enhanced Security

Modern fintech requirements create challenges to the established security infrastructure. Features like digital wallet, robo advisory and blockchain demand new security mechanisms. Microservices follow a best practice of creating a separate identity service which addresses these new requirements.

Faster Delivery

Rapidly bringing new features to market is a cornerstone of successful fintech companies. Microservices make it easier for different application teams to independently deliver new functionality to meet emerging customer demands. Microservices also scale well to accommodate greater numbers of users and transactions.

Seamless Integration

The integration layer in a modern fintech solution needs a powerful set of APIs to communicate with other services, both internally and externally. This API layer is notoriously challenging to manage in a large monolithic application. Microservices make the API layer much easier to manage and secure through isolation, scalability and resilience.

Service mesh makes it easier to manage a complex microservice architecture

In the face of rapidly changing customer, business and regulatory requirements, microservices help financial services companies quickly respond to these changes. But this doesn’t come for free. Companies take on increased operational overhead during the shift to microservices, and technologies such as a service mesh can help manage that.

Service mesh provides a bundle of features around observability, security, and control that are crucial to managing microservices at scale. Previously existing solutions like DNS and configuration management provide some capabilities such as service discovery, but don’t provide fast retries, load balancing, tracing and health monitoring. The old approach to managing microservices requires that you cobble together several different solutions each time a problem arises, but a service mesh bundles it all together in a reusable package. While it’s possible to accomplish some of what a service mesh manages with individual tools and processes, it’s manual and time consuming.

Competition from innovative fintech startups, along with ever increasing customer expectations, means established financial services players must change the way they deliver offerings and do business with their customers. Delivering on these new requirements is difficult with legacy systems. Financial services firms need a software architecture that’s fit for purpose: agile, adaptable, highly scalable, reliable and robust. Microservices make this possible, and a service mesh makes microservices manageable at scale.


How Service Mesh Addresses 3 Major Microservices Challenges

I was recently reading the Global Microservices Trends report by Dimensional Research and found myself thinking "a service mesh could help with that." So I thought I would cover three of the challenges the report raises and how a service mesh addresses them. Respondents cited in the report make it clear microservices are gaining widespread adoption. It's also clear that along with the myriad of benefits they bring, there are also tough challenges that come as part of the package. The report shows:

  • 91% of enterprises are using microservices or have plans to
  • 99% of users report challenges with using microservices

Major Microservices Challenges

The report identifies a range of challenges companies are facing.

Companies are seeing a mix of technology and organizational challenges. I'll focus on the technological challenges a service mesh solves, but it's worth noting that a service mesh also brings uniformity, which makes it possible to achieve the same view across teams and can reduce the need for certain specialized skills.

Each additional microservice increases the operational challenges

Not with a service mesh! A service mesh provides monitoring, scalability, and high availability through APIs instead of using discrete appliances. This flexible framework removes the operational complexity associated with modern applications. Infrastructure services were traditionally implemented as discrete appliances, which meant going to the actual appliance to get the service. Each appliance is unique which makes monitoring, scaling, and providing high availability for each appliance hard. A service mesh delivers these services inside the compute cluster itself through APIs and doesn’t require any additional appliances. Implementing a service mesh means adding new microservices doesn't have to add complexity.

It is harder to identify the root cause of performance issues

The service mesh toolbox gives you a couple of things that help solve this problem:

Distributed Tracing
Tracing provides service dependency analysis for different microservices and tracks requests as they pass through multiple microservices. It’s also a great way to identify performance bottlenecks and zoom into a particular request to determine things like which microservice contributed to the latency of a request or which service created an error.
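Here is a small Python sketch of how span data can answer the "which microservice contributed to the latency" question. The Span fields, service names and timings are made up for illustration, and the calculation assumes child spans run one after another rather than in parallel. Tracing backends do this kind of analysis for you; this just shows the idea.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    start_ms: float
    end_ms: float
    error: bool = False


def self_time_by_service(spans):
    """Time spent in each service, excluding time spent waiting on child spans."""
    child_time = defaultdict(float)
    for span in spans:
        if span.parent_id is not None:
            child_time[span.parent_id] += span.end_ms - span.start_ms
    totals = defaultdict(float)
    for span in spans:
        own = (span.end_ms - span.start_ms) - child_time[span.span_id]
        totals[span.service] += own
    return dict(totals)


# Hypothetical trace: frontend calls orders, which calls inventory.
TRACE = [
    Span("a", None, "frontend", 0, 120),
    Span("b", "a", "orders", 10, 100),
    Span("c", "b", "inventory", 20, 90, error=True),
]

self_times = self_time_by_service(TRACE)
slowest = max(self_times, key=self_times.get)
failing = [span.service for span in TRACE if span.error]
```

In this made-up trace the frontend span is the longest, but most of its time is spent waiting. Subtracting child time shows the hypothetical inventory service is both the main source of latency and the origin of the error.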

Metrics Collection
Another powerful thing you gain with service mesh is the ability to collect metrics. Metrics are key to understanding historically what has happened in your applications, and when they were healthy compared to when they were not. A service mesh can gather telemetry data from across the mesh and produce consistent metrics for every hop. This makes it easier to quickly solve problems and build more resilient applications in the future.

Differing development languages and frameworks

Another major challenge that report respondents noted was maintaining a distributed architecture in a polyglot world. When making the move from monolith to microservices, many companies struggle with the reality that to make things work, they have to use different languages and tools. Large enterprises can be especially affected by this as they have many large, distributed teams. Service mesh provides uniformity by providing programming-language agnosticism, which addresses inconsistencies in a polyglot world where different teams, each with its own microservice, are likely to be using different programming languages and frameworks. A mesh also provides a uniform, application-wide point for introducing visibility and control into the application runtime, moving service communication out of the realm of implied infrastructure, to where it can be easily seen, monitored, managed and controlled.

Microservices are cool, but service mesh makes them ice cold. If you're on the microservices journey and are finding it difficult to manage the infrastructure challenges, a service mesh may be the right answer. Let us know if you have any questions on how to get the most out of service mesh. Our engineering team is always available to talk.


Observability, or "Knowing What Your Microservices Are Doing"

Microservicin’ ain’t easy, but it’s necessary. Breaking your monolith down into smaller pieces is a must in a cloud native world, but it doesn’t automatically make everything easier. Some things actually become more difficult. An obvious area where it adds complexity is communications between services; observability into service to service communications can be hard to achieve, but is critical to building an optimized and resilient architecture.

The idea of monitoring has been around for a while, but observability has become increasingly important in a cloud native landscape. Monitoring aims to give an idea of the overall health of a system, while observability aims to provide insights into the behavior of systems. Observability is about data exposure and easy access to information which is critical when you need a way to see when communications fail, do not occur as expected or occur when they shouldn’t. The way services interact with each other at runtime needs to be monitored, managed and controlled. This begins with observability and the ability to understand the behavior of your microservice architecture.

A primary microservices challenge is trying to understand how individual pieces of the overall system are interacting. A single transaction can flow through many independently deployed microservices or pods, and discovering where performance bottlenecks have occurred provides valuable information.

It depends who you ask, but many considering or implementing a service mesh say that the number one feature they are looking for is observability. There are many other features a mesh provides, but those are for another blog. Here, I’m going to cover the top observability features provided by a service mesh.

Tracing

An overwhelmingly important thing to know about your microservices architecture is specifically which microservices are involved in an end-user transaction. If many teams are deploying their dozens of microservices, all independently of one another, it’s difficult to understand the dependencies across your services. Service mesh provides uniformity which means tracing is programming-language agnostic, addressing inconsistencies in a polyglot world where different teams, each with its own microservice, can be using different programming languages and frameworks.

Distributed tracing is great for debugging and understanding your application’s behavior. The key to making sense of all the tracing data is being able to correlate spans from different microservices which are related to a single client request. To achieve this, all microservices in your application should propagate tracing headers. If you’re using a service mesh like Aspen Mesh, which is built on Istio, the ingress and sidecar proxies automatically add the appropriate tracing headers and report the spans to a tracing collector backend. Istio provides distributed tracing out of the box, making it easy to integrate tracing into your system. Propagating tracing headers in an application can provide nice hierarchical traces that graph the relationship between your microservices. This makes it easy to understand what is happening when your services interact and if there are any problems.
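The one piece the proxies can't do for you is header propagation inside the application: when a service receives a request and makes its own outbound calls, it has to copy the tracing headers across. The Python sketch below shows the idea. The header names are the B3-style set listed in Istio's distributed tracing documentation; treat the exact list as an assumption and check it against the mesh version you run. The function name is my own.

```python
# Header names assumed from Istio's distributed tracing documentation (B3 style).
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def propagate_trace_headers(incoming, outgoing=None):
    """Copy tracing headers from an inbound request onto an outbound request.

    Header names are matched case-insensitively; everything else
    (cookies, auth headers and so on) is deliberately left behind.
    """
    outgoing = dict(outgoing or {})
    lowered = {name.lower(): value for name, value in incoming.items()}
    for name in TRACE_HEADERS:
        if name in lowered:
            outgoing[name] = lowered[name]
    return outgoing
```

A service would call this with the headers of the request it is handling and pass the result to whatever HTTP client it uses for downstream calls, so the spans from each hop can be stitched into one trace.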

Metrics

A service mesh can gather telemetry data from across the mesh and produce consistent metrics for every hop. Deploying your service traffic through the mesh means you automatically collect metrics that are fine-grained and provide high level application information since they are reported for every service proxy. Telemetry is automatically collected from any service pod providing network and L7 protocol metrics. Service mesh metrics provide a consistent view by generating uniform metrics throughout. You don’t have to worry about reconciling different types of metrics emitted by various runtime agents, or add arbitrary agents to gather metrics for legacy apps. It’s also no longer necessary to rely on the development process to properly instrument the application to generate metrics. The service mesh sees all the traffic, even into and out of legacy “black box” services, and generates metrics for all of it.

Valuable metrics that a service mesh gathers and standardizes include:

  • Success Rates
  • Request Volume
  • Request Duration
  • Request Size
  • Request and Error Counts
  • Latency
  • HTTP Error Codes

These metrics make it simpler to understand what is going on across your architecture and how to optimize performance.
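To show how a few of these metrics fall out of per-request records, here is a short Python sketch. The RequestRecord shape and the sample data are hypothetical, errors are assumed to mean 5xx responses, and the percentile uses the simple nearest-rank method. A real mesh exports these numbers to a metrics backend rather than computing them in application code.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestRecord:
    service: str
    status: int
    duration_ms: float


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Request volume, error count, success rate and latency percentiles."""
    durations = [record.duration_ms for record in records]
    errors = sum(1 for record in records if record.status >= 500)  # assumption: 5xx only
    return {
        "request_count": len(records),
        "error_count": errors,
        "success_rate": (len(records) - errors) / len(records),
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
    }


# Hypothetical records, as a proxy might observe them for one service.
RECORDS = [
    RequestRecord("orders", 200, 10),
    RequestRecord("orders", 200, 20),
    RequestRecord("orders", 200, 30),
    RequestRecord("orders", 404, 40),
    RequestRecord("orders", 503, 500),
]

summary = summarize(RECORDS)
```

Because every proxy produces records of the same shape, the same summary works for every service in the mesh, including ones that were never instrumented.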

Most failures in the microservices space occur during the interactions between services, so a view into those transactions helps teams better manage architectures to avoid failures. Observability provided by a service mesh makes it much easier to see what is happening when your services interact with each other, making it easier to build a more efficient, resilient and secure microservice architecture.


Top 3 Reasons to Manage Microservices with Service Mesh


Building microservices is easy; operating a microservice architecture is hard. Many companies are successfully using tools like Kubernetes for deploys, but they still face runtime challenges. This is where the service mesh comes in. It greatly simplifies the managing of containerized applications and makes it easier to monitor and secure microservice-based applications. So what are the top 3 reasons to use a supported service mesh? Here’s my take.

Security

Because a service mesh operates in the data plane, it’s possible to apply a common security policy to all traffic across the mesh, which is stronger than securing each layer separately in an environment like Kubernetes. A service mesh secures inter-service communications so you can know what a service is talking to and if that communication can be trusted.
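One common way a mesh secures inter-service communication is mutual TLS, where both sides of a connection present and verify certificates. The Python sketch below uses the standard library ssl module to show what "mutual" means in practice. In a mesh the sidecar proxies would do this on the services' behalf; the function names and the certificate file parameters here are hypothetical.

```python
import ssl


def server_context(cert_file=None, key_file=None, ca_file=None):
    """TLS context for the receiving side that also demands a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # the "mutual" part: callers must present a cert
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signs workload certificates
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def client_context(cert_file=None, key_file=None, ca_file=None):
    """TLS context for the calling side: verifies the server and presents its own cert."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

With both contexts in place, each service knows who it is talking to and the traffic between them is encrypted, which is the trust question raised above.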

Observability

Most failures in the microservices space occur during the interactions between services, so a view into those transactions helps teams better manage architectures to avoid failures. A service mesh provides a view into what is happening when your services interact with each other. The mesh also greatly improves tracing capabilities and provides the ability to add tracing without touching all of your applications.

Simplicity

A service mesh is not a new technology, but rather a bundling together of several existing technologies in a package that makes managing the infrastructure layer much simpler. There are existing solutions that cover some of what a mesh does. Take DNS, for example. It’s a good way to do service discovery when you don’t care about the source trying to discover the service. If all you need in service discovery is to find the service and connect to it, DNS is sufficient, but it doesn’t give you fast retries or health monitoring. When you want to ask more advanced questions, you need a service mesh. You can cobble things together to address much of what a service mesh addresses, but why would you want to if you could just interact with a service mesh that provides a one-time, reusable packaging?

There are certainly many more advantages to managing microservices with a service mesh, but I think the above 3 are major selling points where organizations that are looking to scale their microservice architecture would find the greatest benefit. No doubt there will also be expanded capabilities in the future such as analytics dashboards that provide easy to consume insights from the huge amount of data in a service mesh. I’d love to hear other ideas you might have on top reasons to use service mesh. Hit me up @zjory.