Virtual Host Routing with Logical DNS Names

Let me describe a common service mesh scenario...

You've deployed your application and it is happily consuming some external resources on the 'net. For example, say that reviews.default.svc.cluster.local is communicating with the external service redis-12.eu-n-3.example.com, but you need to switch to a new external service, redis-db-4.eu-n-1.example.com. You're using a service mesh, right? The light bulb goes on: how about we just redirect all traffic from redis-12.eu-n-3.example.com to redis-db-4.eu-n-1.example.com? That certainly will work; add or modify a few resources and voila, traffic is re-routed with zero downtime!

Only now there's a new problem — your system is looking less like the tidy cluster you started with and more like a bowl of spaghetti!

What if we used a neutral name for the database? How about db.default.svc.cluster.local? We might start with the same mechanism for re-routing traffic: from db.default.svc.cluster.local to redis-12.eu-n-3.example.com. Then when we need to make the above change, we just update the configuration to route traffic from db.default.svc.cluster.local to redis-db-4.eu-n-1.example.com. Done, and again with zero downtime!

This is Virtual Host Routing to a Logical DNS Name. Virtual Host Routing is traditionally a server-side concept — a server responding to requests for one or more virtual servers. With a service mesh, it's fairly common to also apply this routing to the client side, redirecting traffic destined for one service to another service.

To give you a bit more context, a "logical name" is defined as a placeholder name that is mapped to a physical name when a request is made. An application might be configured to talk to its database at db.default.svc.cluster.local which is then mapped to redis-12.eu-n-3.example.com in one cluster and redis-db-4.eu-n-1.example.com in another.

Common practice is to use configuration to supply DNS names to an application (for example, a DB_HOST environment variable set directly to redis-12.eu-n-3.example.com or redis-db-4.eu-n-1.example.com). But by pointing the configuration at a physical server, you make it harder to redirect the traffic later; supplying the logical name instead keeps that flexibility.
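For example, a Deployment might wire the database host into the application through an environment variable that points at the logical name rather than a physical one (a minimal sketch; the DB_HOST variable name, image, and labels are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: reviews
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reviews
  template:
    metadata:
      labels:
        app: reviews
    spec:
      containers:
      - name: reviews
        image: example/reviews:latest
        env:
        - name: DB_HOST
          # Logical name; the mesh decides which physical Redis this maps to
          value: db.default.svc.cluster.local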

Best Practices

What are some best practices for working with external services? Practices like restricting outbound traffic and having the sidecar perform TLS origination can have a significant impact. The best practices listed below are not required, but this post is written assuming they are being followed.

Restricting Outbound Traffic

The outbound traffic policy determines whether external services must be declared. A common setting for this policy is ALLOW_ANY: any application running in your cluster can communicate with any external service. We recommend setting the outbound traffic policy to REGISTRY_ONLY, which requires that external services be defined explicitly. For security, the Aspen Mesh distribution of Istio defaults to REGISTRY_ONLY.

If you are using another Istio distribution, or if you want to set the outbound traffic policy explicitly, restrict outbound traffic by adding the following to your values file when deploying the istio chart:

global:
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY

TLS Origination

If an application communicates directly over HTTPS to upstream services, the service mesh can't inspect the traffic and it has no idea if requests are failing (it's all just encrypted traffic to the service mesh). The proxy is just routing bits. By having the proxy do "TLS origination", the service mesh sees both requests and responses and can even do some intelligent routing based on the content of the requests.

For the rest of this blog, configure your application to communicate over just HTTP (change the https://... configuration to just http://...); we'll step through how to have the mesh originate TLS on its behalf.

How to Set Up Virtual Host Routing to a Logical DNS Name

Service

A logical DNS name must still be resolvable; otherwise the service mesh won't attempt to route traffic to it. In the yaml below, we define a Kubernetes Service named httpbin (with no selector or backing pods), which gives us a resolvable DNS name of httpbin.default.svc.cluster.local that we can route traffic to.

apiVersion: v1
kind: Service
metadata:
  name: httpbin
spec:
  ports:
  - port: 443
    name: https
  - port: 80
    name: http

ServiceEntry

A service entry declares an external host that services running in our cluster need to communicate with over the Internet. The actual host (physical name) is listed (httpbin.org in this example). Note that because we have the proxy doing TLS origination (just plain http between the application and the proxy), port 443 lists a protocol of HTTP (instead of HTTPS).

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 443
    name: http-port-for-tls-origination
    protocol: HTTP
  resolution: DNS
  location: MESH_EXTERNAL

VirtualService

A virtual service defines a set of rules to apply when traffic is routed to a specific host. In this example, when traffic is routed to the /foo endpoint of httpbin.default.svc.cluster.local, the following rules are applied:

  1. Rewrite the URI from /foo to /get
  2. Rewrite the Host header from httpbin.default.svc.cluster.local to httpbin.org
  3. Re-route the traffic to httpbin.org

Note that just re-routing the traffic is not sufficient for the server to handle our requests. The Host header is how a server (which may be serving many virtual hosts) knows which site a request is intended for, so it must be rewritten to match the physical name.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin.default.svc.cluster.local
  http:
  - match:
    - uri:
        prefix: /foo
    rewrite:
      uri: /get
      authority: httpbin.org
    route:
    - destination:
        host: httpbin.org
        port:
          number: 443

DestinationRule

A destination rule defines policies that are applied to traffic after routing has occurred. In this case we define policies for traffic going to port 443 of httpbin.org. The above configuration is routing plain HTTP traffic to port 443. The following destination rule indicates that this traffic should be sent over HTTPS via TLS (the proxy will do TLS origination).

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin.org
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    portLevelSettings:
    - port:
        number: 443
      tls:
        mode: SIMPLE # initiates HTTPS when accessing httpbin.org

Testing with a simple pod

That's it! You can now deploy a service and configure it to talk to http://httpbin.default.svc.cluster.local/foo and traffic will get re-routed to https://httpbin.org/get. Let's test it out...

1. Create a pod (just for testing; typically you use deployments to create and manage pods):

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: pstauffer/curl
    command: ["/bin/sleep", "3650d"]

$ kubectl apply -f pod.yaml

The above pod just sleeps for 10 years. Not very interesting by itself, but it provides the curl command that we can use for testing.

2. Curl the logical name:

$ kubectl exec -c test-container test-pod -it -- \
    curl -v http://httpbin.default.svc.cluster.local/foo

Here is the expected output (the response body was removed for brevity):

*   Trying 100.66.72.128...
* TCP_NODELAY set
* Connected to httpbin.default.svc.cluster.local (100.66.72.128) port 80 (#0)
> GET /foo HTTP/1.1
> Host: httpbin.default.svc.cluster.local
> User-Agent: curl/7.60.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Mon, 23 Sep 2019 22:04:08 GMT
< content-type: application/json
< content-length: 916
< x-amzn-requestid: 1743ed99-df5b-41c2-aa46-9662e10be674
< cache-control: public, max-age=86400
< x-envoy-upstream-service-time: 219
< server: envoy
<
* Connection #0 to host httpbin.default.svc.cluster.local left intact
...

And with that, you should be set!

Virtual Host Routing to a Logical DNS Name can be a useful tool, allowing an application to communicate with external services without hard-coding the physical DNS name of the external service. And a service mesh makes it easy, enhancing your capabilities and keeping things rational (no offense to spaghetti lovers!).

If you enjoyed learning about (and trying out!) this topic, subscribe to our blog to get updates when new articles are posted.


Using D3 in React: A Pattern for Using Data Visualization at Scale

Data visualization is an important part of what we do at Aspen Mesh. When you implement a service mesh, it provides a huge trove of data about your services. Everything you need to know about how your services are communicating is available, but separating the signal from the noise is essential. Data visualization is a powerful tool to distill complex data sets into simple, actionable visuals. To build these visualizations, we use React and D3. React is great for managing a large application and organizing code into discrete components to keep you sane. D3 is magical for visuals on large data sets. Unfortunately, much of the usefulness of each library is lost and bugs are easy to come by when they are not conscientiously put together. In the following article, I will detail the pattern that I work with to build straightforward D3-based visualization components that fit easily within a large-scale React application.

Evaluating React + D3 Patterns

The difficulty in putting both libraries together is that each has its own way of manipulating the DOM: React through JSX and its Virtual DOM, and D3 through .append(). The simplest way to combine them is to let them do their own thing in isolation and act as black boxes to each other. I did not like this approach because it felt like jamming a separate D3 application inside of our existing React app. The code was structured differently, it had to be tested differently and it was difficult to use existing React components and event handlers. I kept researching and playing around with the code until I came upon a pattern that addresses those issues. It enables React to track everything in the Virtual DOM and still allows D3 to do what it does best.

Key Aspects:

  • Allow React to handle entering and exiting elements so it can keep track of everything in the Virtual DOM.
  • Code structured and tested the same way as the rest of the React app.
  • Utilize React lifecycle methods and key attribute to emulate D3 data joins.
  • Manipulate and update element attributes through D3 by selecting the React ref object.
  • D3 for all the tough math. Scales, axes, transitions.

To illustrate the pattern, I will build out a bar graph that accepts an updating data set and transitions between them. The chart has an x axis based on date and a y axis based on a numerical value. Each data point looks like this:

interface Data {
  id: number;
  date: string;
  value: number;
}

I'll focus on the core components, but to run it and see it all working together, check out the git repo.
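To give a sense of how the pieces are wired together, mounting the root component with some sample data might look like this (a usage sketch; the import path and element id are assumptions):

import React from "react";
import ReactDOM from "react-dom";
import Svg from "./Svg";

// Data is the interface shown above
const data: Data[] = [
  { id: 1, date: "9/19/2018", value: 1 },
  { id: 2, date: "11/23/2018", value: 33 }
];

// Render the chart at a fixed size; the components described below handle
// scales, axes, and transitions as the data prop changes.
ReactDOM.render(
  <Svg svgHeight={500} svgWidth={500} data={data} />,
  document.getElementById("root")
);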

SVG Component

The root element to our chart is an SVG. The SVG is rendered through JSX, and subsequent chart elements, such as the axes and the bar elements, are passed in as child components. The SVG is responsible for setting the size and margin and dictating that to its child elements. It also creates scales based on the data and available size and passes the scales down. The SVG component can handle resizing as well as binding D3 panning and zooming functionality. I won't illustrate zooming and panning here, but if you're interested, check out this component. The basic SVG component looks like this.

interface SVGProps {
  svgHeight: number;
  svgWidth: number;
  data: Data[];
}


export default class Svg extends React.Component<SVGProps> {
  render() {
    const { svgHeight, svgWidth, data } = this.props;

    const margin = { top: 20, right: 20, bottom: 30, left: 40 };
    const width = svgWidth - margin.left - margin.right;
    const height = svgHeight - margin.top - margin.bottom;

    const xScale = d3
      .scaleBand()
      .range([0, width])
      .padding(0.1);

    const yScale = d3.scaleLinear().range([height, 0]);

    xScale.domain(data.map(d => d.date));
    yScale.domain([0, d3.max(data, d => d.value) || 0]);

    const axisBottomProps = {
      height,
      scale: xScale
    };
    const axisLeftProps = { scale: yScale };

    const barProps = {
      height,
      width,
      xScale,
      yScale,
      data
    };

    return (
      <svg height={svgHeight} width={svgWidth}>
        <g transform={`translate(${margin.left},${margin.top})`}>
          <AxisBottom {...axisBottomProps} />
          <AxisLeft {...axisLeftProps} />
          <Bars {...barProps} />
        </g>
      </svg>
    );
  }
}

Axes

There are two axis components for the left and bottom axes, and they receive the corresponding D3 scale object as a prop. The React ref object is key to linking up D3 and React. React will render the element, and so keep track of it in the Virtual DOM, and will then pass the ref to D3 so it can manage all the complex attribute math. On componentDidMount, D3 selects the ref object and then calls the corresponding axis function on it, building the axis. On componentDidUpdate, the axis is redrawn with the updated scale after a D3 transition() to give it a smooth animation. The left axis component looks as follows:

interface AxisProps {
  scale: d3.ScaleLinear<any, any>;
}

export default class Axis extends React.Component<AxisProps> {
  ref: React.RefObject<SVGGElement>;

  constructor(props: AxisProps) {
    super(props);
    this.ref = React.createRef();
  }

  componentDidMount() {
    if (this.ref.current) {
      d3.select(this.ref.current).call(d3.axisLeft(this.props.scale));
    }
  }

  componentDidUpdate() {
    if (this.ref.current) {
      d3.select(this.ref.current)
        .transition()
        .call(d3.axisLeft(this.props.scale));
    }
  }

  render() {
    return <g ref={this.ref} />;
  }
}

Rendering Bars with React Data Joins

The Bars element illustrates how to emulate D3's data join functionality through React lifecycle methods and its key attribute. Data joins allow us to map DOM elements to specific data points and to recognize when those data points enter our set, exit, or are updated by a change in the data set. It is a powerful way to visually represent data constancy between changing data sets. It also allows us to update our chart and only redraw elements that change instead of redrawing the entire graph.

Using D3 data joins directly, with the .enter() or .exit() methods, requires us to append elements through D3 outside of React's Virtual DOM and generally ruins everything. To get around this limitation, we can instead mimic D3 data joins through React's lifecycle methods and its own diffing algorithm. The functions that would run on .enter() can be executed inside componentDidMount, updates in componentDidUpdate, and .exit() in componentWillUnmount. Running transitions in componentWillUnmount requires using React Transitions to delay the element from being removed from the DOM until the transition has run.

The piece React needs to map an element to a data point, in this case a bar to a number and a date, is the component's key attribute. By making the key attribute a unique value for each data point, React can recognize through its diffing algorithm whether that element needs to be added, removed, or just updated based on the data point it represents. The key attribute works exactly the same as the key function that would be passed to D3's .data() function.

In this example, two components are created to render the bars on the chart. The first component, Bars, will map over each data point and render a corresponding Bar component. It binds each data point to the Bar component through the datum prop and assigns a unique key attribute, in this case the data point's unique id.

interface BarsProps {
  data: Data[];
  height: number;
  width: number;
  xScale: d3.ScaleBand<any>;
  yScale: d3.ScaleLinear<any, any>;
}

class Bars extends React.Component<BarsProps> {
  render() {
    const { data, height, width, xScale, yScale } = this.props;
    const barProps = {
      height,
      width,
      xScale,
      yScale
    };
    const bars = data.map(datum => {
      return <Bar key={datum.id} {...barProps} datum={datum} />;
    });
    return <g className="bars">{bars}</g>;
  }
}

The Bar component renders a <rect /> element and passes the ref object to D3 in its lifecycle methods. The lifecycle methods then operate on the element's attributes in familiar D3 dot notation.

interface BarProps {
  datum: Data;
  height: number;
  width: number;
  xScale: d3.ScaleBand<any>;
  yScale: d3.ScaleLinear<any, any>;
}

class Bar extends React.Component<BarProps> {
  ref: React.RefObject<SVGRectElement>;

  constructor(props: BarProps) {
    super(props);
    this.ref = React.createRef();
  }

  componentDidMount() {
    const { height, datum, yScale, xScale } = this.props;

    d3.select(this.ref.current)
      .attr("x", xScale(datum.date) || 0)
      .attr("y", yScale(datum.value) || 0)
      .attr("fill", "green")
      .attr("height", 0)
      .transition()
      .attr("height", height - yScale(datum.value));
  }

  componentDidUpdate() {
    const { datum, xScale, yScale, height } = this.props;
    d3.select(this.ref.current)
      .attr("fill", "blue")
      .transition()
      .attr("x", xScale(datum.date) || 0)
      .attr("y", yScale(datum.value) || 0)
      .attr("height", height - yScale(datum.value));
  }

  render() {
    const { xScale } = this.props;
    const attributes = {
      width: xScale.bandwidth()
    };
    return <rect data-testid="bar" {...attributes} ref={this.ref} />;
  }
}

Testing

By rendering everything through the React Virtual DOM, we can run tests on it with the same setup as we would use to test our other components. This test setup checks that each data point is represented as a bar in the SVG. Two data points are given initially, and then the component is rerendered with only one of the data points. We test that there are two green bars from the initial mount. Then we test that the update is applied correctly and we only have a single blue bar.

import React from "react";
import "jest-dom/extend-expect";
import { render } from "react-testing-library";
import Svg from "../Svg";

it("renders a bar for each data point", () => {
  const svgHeight = 500;
  const svgWidth = 500;
  const data = [
    { id: 1, date: "9/19/2018", value: 1 },
    { id: 2, date: "11/23/2018", value: 33 }
  ];

  const barProps = {
    svgHeight,
    svgWidth,
    data
  };

  const barProps2 = {
    ...barProps,
    data: [data[0]]
  };

  const { rerender, getAllByTestId, getByTestId } = render(
    <Svg {...barProps} />
  );
  expect(getAllByTestId("bar").length).toBe(2);
  expect(getByTestId("bar")).toHaveAttribute("fill", "green");

  rerender(<Svg {...barProps2} />);

  expect(getAllByTestId("bar").length).toBe(1);
  expect(getByTestId("bar")).toHaveAttribute("fill", "blue");
});

I like this pattern a lot. It fits really nicely into the existing production React app and it allows for recognizable code patterns by encouraging building components for each element in a D3 visualization. It's a smaller learning curve for React developers to build out large amounts of D3, and we can use existing display components and event systems within D3-managed visualizations. By allowing D3 to manage the attributes, we can still use advanced features like transition animations, panning and zooming.

Creating a UI for a service mesh requires managing a lot of complex data and then representing that data in intuitive ways. By combining React and D3 judiciously, we can allow React to do what it does best and manage large application state and then let D3 shine by creating excellent visualizations.

If you want to check out what the final product looks like, check out the Aspen Mesh beta. It's free and easy to sign up for.


Inline yaml Editing with yq

So you're working hard at building a solid Kubernetes cluster — maybe using kops to create a new instance group and BAM you are presented with an editor session to edit the details of that shiny new instance group. No biggie; you just need to add a simple little detailedInstanceMonitoring: true to the spec and you are good to go.

Okay, now you need to do this several times a day to test the performance of the latest build, and this is just one of several steps to get the cluster up and running. You want to automate building that cluster as much as possible, but every time you get to the step to create that instance group, BAM, there it is again: your favorite editor, and you have to add that same line every time.

Standard practice is to use cluster templating but there are times when you need something more lightweight. Enter yq.

yq is great for digging through yaml files, but it also has an in-place merge function that can modify a file directly, just like any editor. And kops, along with several other command line tools, honors the EDITOR environment variable, so you can automate your yaml editing along with the rest of your cluster handiwork.

Making it work

The first roadblock is that while you can pass command line options via the EDITOR environment variable, the file being edited in-place must be the last option (it is appended by kops when it invokes the editor). yq wants you to pass it the file to be edited followed by a patch file with instructions on editing the file (more on that below). To get around this, I use a little bash script to invoke yq and reorder the last two command line arguments, like so (I'll call the file yq-merge-editor.sh):

#!/usr/bin/env bash

if [[ $# != 2 ]]; then
    echo "Usage: $0 <merge file (supplied by script)> <file being edited (supplied by invoker of EDITOR)>"
    exit 1
fi

yq merge --inplace --overwrite "$2" "$1"

In the above script, the merge option tells yq we want to merge yaml files and --inplace says to edit the first file in-place. The --overwrite option instructs yq to overwrite existing sections of the file if they are defined in the merge file. $2 is the file to be edited and $1 is the merge file (the opposite order of what the script gets them in). There are other useful options available documented in the yq merge documentation.

Example 1: Turning on detailed instance monitoring

The next step is to create a patch file containing the edit you want to perform. In this example, we will turn on detailed instance monitoring which is a useful way to get more metrics from your nodes. Here's the merge file (we will call this file ig-monitoring.yaml):

spec:
  detailedInstanceMonitoring: true

To put it all together, you can invoke kops with a custom editor command:

EDITOR="./yq-merge-editor.sh ./ig-monitoring.yaml" kops edit instancegroups nodes

That's it! kops creates a temporary file and invokes your editor script which invokes yq. yq edits the temporary file in-place and kops takes the edited output and moves on.
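If you want a dry run first, you can point the editor script at a scratch copy of the manifest and check the result by hand (the file paths here are illustrative):

$ kops get instancegroups nodes -o yaml > /tmp/nodes-ig.yaml
$ ./yq-merge-editor.sh ./ig-monitoring.yaml /tmp/nodes-ig.yaml
$ grep detailedInstanceMonitoring /tmp/nodes-ig.yaml
  detailedInstanceMonitoring: true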

Example 2: Temporarily add nodes

Say you want to temporarily add capacity to your cluster while performing some maintenance. This is a temporary change, so there's no need to update your cluster's configuration permanently. The following patch file (call it ig-nodes-25.yaml) will update the min and max node counts in an instance group:

spec:
  maxSize: 25
  minSize: 25

Then invoke the same script from above followed by a kops update:

EDITOR="./yq-merge-editor.sh ig-nodes-25.yaml" kops edit instancegroups nodes
kops update cluster $NAME --yes

These tips should make it easier to build lots of happy clusters!


Using AWS Services from Istio Service Mesh with Go

This is a quick blog on how we use AWS services from inside of an Istio Service Mesh. Why does it matter that you’re inside the mesh? Because the service mesh wants to manage all the traffic in/out of your application. This means it needs to be able to inspect the traffic and parse it if it is HTTP. Nothing too fancy here, just writing it down in case it can save you a few keystrokes.

Our example is for programs written in Go.

Step 1: Define an Egress Rule

You need an egress rule to allow the application to talk to the AWS service at all; in our case that means allowing egress to DynamoDB.
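A sketch of such a rule, written as a ServiceEntry in the networking.istio.io/v1alpha3 API (older Istio releases used the EgressRule kind instead, and the us-west-2 host name here is illustrative):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: dynamodb-us-west-2
spec:
  hosts:
  - dynamodb.us-west-2.amazonaws.com
  ports:
  - number: 443
    name: http-port-for-tls-origination
    protocol: HTTP
  resolution: DNS
  location: MESH_EXTERNAL

Port 443 is declared as HTTP because the application will send plain HTTP and let the sidecar originate TLS (the same pattern as the httpbin ServiceEntry in the first post above); a companion DestinationRule with tls mode SIMPLE completes that setup.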

Step 2: Delegate Encryption to the Sidecar

This part is the trick we were missing. If you want to get maximum service mesh benefits, you need to pass unencrypted traffic to the sidecar. The sidecar will inspect it, apply policy and encrypt it before egressing to the AWS service (in our case Dynamo).

Don’t worry, your traffic is not going out on any real wires unencrypted, only over the loopback from your app container to the sidecar. In Kubernetes, the pod has its own network namespace, so even other containers on the same node cannot see it unencrypted.

package awswrapper

import (
  "net/http"

  "github.com/aws/aws-sdk-go/aws"
  "github.com/aws/aws-sdk-go/aws/endpoints"
  "github.com/aws/aws-sdk-go/aws/request"
  "github.com/aws/aws-sdk-go/aws/session"
  "github.com/golang/glog"

  "github.com/you/repo/pkg/tracing"
)

type Config struct {
  InMesh   bool
  Endpoint string // http://dynamodb.us-west-2.amazonaws.com
  Label    string // Used in logging messages to identify
}

// istioEgressEPResolver resolves the normal AWS endpoint, then pins the port
// to 443 so the sidecar's egress rule for the HTTPS port is matched.
func istioEgressEPResolver(service, region string, optFns ...func(*endpoints.Options)) (endpoints.ResolvedEndpoint, error) {
  ep, err := endpoints.DefaultResolver().EndpointFor(service, region, optFns...)
  if err != nil {
    return ep, err
  }
  ep.URL = ep.URL + ":443"
  return ep, nil
}

func AwsConfig(cfg Config) *aws.Config {
  config := aws.NewConfig().
    WithEndpoint(cfg.Endpoint)
  if cfg.InMesh {
    glog.Infof("Using http for AWS for %s", cfg.Label)
    config = config.WithDisableSSL(true).
      WithEndpointResolver(endpoints.ResolverFunc(istioEgressEPResolver))
  }
  return config
}

func AwsSession(label string, cfg *aws.Config) (*session.Session, error) {
  sess, err := session.NewSession(cfg)
  if err != nil {
    return nil, err
  }
  // addTracingHeaders and removeTracingHeaders inject/strip the OpenTracing
  // headers; see the distributed tracing post below for how they work.
  //
  // This has to be the first handler before core.SendHandler, which
  // performs the operation of sending the request over the wire.
  // Note that Send handlers are invoked after the signing of the request is
  // completed, which means tracing headers are not signed.
  // Signing of tracing headers causes request failures as Istio changes the
  // headers and signature validation fails.
  sess.Handlers.Send.PushFront(addTracingHeaders)
  sess.Handlers.Send.PushBack(func(r *request.Request) {
    glog.V(6).Infof("%s: %s %s://%s%s",
      label,
      r.HTTPRequest.Method,
      r.HTTPRequest.URL.Scheme,
      r.HTTPRequest.URL.Host,
      r.HTTPRequest.URL.Path,
    )
  })
  // This handler is added after core.SendHandler so that the tracing headers
  // can be removed. This is required because on retries the request is signed
  // again, and if the request headers contain tracing headers the retry's
  // signature validation will fail as Istio will update these headers.
  sess.Handlers.Send.PushBack(removeTracingHeaders)
  return sess, nil
}

AwsConfig() is the core - you need to make a new aws.Session with these options.

The first option, WithDisableSSL(true), tells the AWS libraries to not use HTTPS and instead just speak plain HTTP. This is very bad if you are not in the mesh. But, since we are in the mesh, we’re only going to speak plain HTTP over to the sidecar, which will convert HTTP into HTTPS and send it out over the wire. In Kubernetes, the sidecar is in an isolated network namespace with your app pod, so there’s no chance for other pods or processes to snoop this plaintext traffic.

When you set the first option, the library will try to talk to http://dynamodb.us-west-2.amazonaws.com on port 80 (hey, you asked it to disable SSL). But that’s not what we want - we want to act like we’re talking to 443 so that the right egress rule gets invoked and the sidecar encrypts traffic. That’s what istioEgressEPResolver is for.

We do it this way for a little bit of belts-and-suspenders safety - we really want to avoid ever accidentally speaking HTTP to dynamo. Here are the various failure scenarios:

  • Our service is in Istio, and the user properly configured InMesh=true: everything works and is HTTPS via the sidecar.
  • Our service is not in Istio, and the user properly configured InMesh=false: everything works and is HTTPS via the AWS go library.
  • Our service is not in Istio, but oops! the user set InMesh=true: the initial request goes out to dynamo on port 443 as plain HTTP. Dynamo rejects it, so we know it’s broken before sending a bunch of data via plain HTTP.
  • Our service is in Istio, but oops! the user set InMesh=false: the sidecar rejects the traffic as it is already-encrypted HTTPS that it can’t make any sense of.

OK, now you’ve got an aws.Session instance ready to go. Pass it to your favorite AWS service interface and go:

import (
  "github.com/aws/aws-sdk-go/aws"
  "github.com/aws/aws-sdk-go/aws/session"
  "github.com/aws/aws-sdk-go/service/dynamodb"

  "github.com/you/repo/pkg/awswrapper"
)

type Dynamo struct {
  Session *session.Session
  Db      *dynamodb.DynamoDB
}

func NewWithConfig(cfg *aws.Config) (*Dynamo, error) {
  sess, err := awswrapper.AwsSession("Test", cfg)
  if err != nil {
    return nil, err
  }
  dyn := &Dynamo{
    Session: sess,
    Db:      dynamodb.New(sess),
  }
  return dyn, nil
}

p.s. What’s up with addTracingHeaders() and removeTracingHeaders()? Check out Neeraj’s post. While you’re at it, you can add just a few more lines and get great end-to-end distributed tracing.


Distributed tracing with Istio in AWS

Everybody loves tracing! Am I right? If you attended KubeCon (my bad, CloudNativeCon!) 2017 at Austin or saw any of the keynotes posted online, you would have noticed the recurring theme explaining the benefits of tracing, especially as it relates to DevOps tools. Istio and service mesh were hot topics and many sessions discussed how Istio provides distributed tracing out of the box making it easier for application developers to integrate tracing into their system.

Indeed, a great benefit of using a service mesh is getting more visibility and understanding of your applications. Since this is a tech post (I remember categorizing it as such), let’s dig deeper into how Istio provides application tracing.

When using Istio, a sidecar envoy proxy is automatically injected next to your applications (in Kubernetes this means adding containers to the application Pod). This sidecar proxy intercepts all traffic and can add/augment tracing headers to the requests entering/leaving the application container. Additionally, the sidecar proxy also handles asynchronous reporting of spans to the tracing backends like Jaeger, Zipkin, etc. Sounds pretty awesome!

One thing that the applications do need to implement is propagating tracing headers from incoming to outgoing requests as mentioned in this Istio guide. Simple enough right? Well it’s about to get interesting.

Before we proceed further, first a little background on why I’m writing this blog. We here at Aspen Mesh offer a supported enterprise service mesh built on open source Istio. Not only do we offer a service mesh product but we also use it in our production SaaS platform hosted in AWS (isn’t that something?).

I was tasked with propagating tracing headers in our applications so that we get nice hierarchical traces graphing the relationship between our microservices. As we are hosted in AWS, many of our microservices make outgoing requests to AWS services. During this exercise, I found some interesting interactions between adding tracing headers and using Istio with AWS services, so I decided to share my experience. This blog describes the various iterations I went through to get it all working together.

The application in question for this post is a simple web server. When it receives an HTTP request, it makes an outbound DynamoDB query to fetch an item. As it is deployed in the Istio service mesh, the sidecar proxy automatically adds tracing headers to the incoming request. I wanted to propagate the tracing headers from the incoming request to the DynamoDB query request so that all the traces get tied together.

First Iteration

To achieve this, I decided to pass a custom function as a request option to the AWS DynamoDB API, which allows you to augment request headers before they are transmitted over the wire. In the snippet below I’m using the AWS go-sdk’s dynamo.GetItemWithContext for fetching an item and passing AddTracingHeaders as the request.Option. Note that the AddTracingHeaders method uses the standard OpenTracing API for injecting headers from an input context.

func AddTracingHeaders() awsrequest.Option {
  return func(req *awsrequest.Request) {
    if span := ot.SpanFromContext(req.Context()); span != nil {
      ot.GlobalTracer().Inject(
        span.Context(),
        ot.HTTPHeaders,
        ot.HTTPHeadersCarrier(req.HTTPRequest.Header))
    }
  }
}

// ctx is the incoming request's context as received from the mesh
func makeDynamoQuery(ctx context.Context) {
  // Note that AddTracingHeaders is passed as awsrequest.Option
  result, err := dynamo.GetItemWithContext(ctx, ..., AddTracingHeaders())
  // Do something with result
}

Ok, time for testing this solution! The new version compiles, and I verified locally that it is able to fetch items from DynamoDB. After deploying the new version in production with Istio (sidecar injected), I’m hoping to see the traces nicely tied together. Indeed, the traces look much better, but wait: all of the responses from DynamoDB are now HTTP Status Code 400. Bummer!

Looking at the error messages from aws-go-sdk, we are getting AccessDeniedException, which according to AWS docs indicates that the signature is not valid. Adding tracing headers seems to have broken signature validation, which is odd yet interesting: I had tested in my dev environment (without the sidecar proxy) and the DynamoDB requests worked fine, but in production they stopped working. Typical developer nightmare!

Digging into the AWS sdk package, I found that the client code signs every request including headers with a few hardcoded exceptions. The difference between the earlier and the new version is the addition of tracing headers to the request which are now getting signed and then handed to the sidecar proxy. Istio’s sidecar proxy (in this case Envoy) changes these tracing headers (as it should!) before sending it to DynamoDB service which breaks the signature validation at the server.

To get this fixed we need to ensure that the tracing headers are added after the request is signed but before it is sent out by the AWS sdk. This is getting more complicated, but still doable.

Second Iteration

I couldn’t find an easy way to whitelist these tracing headers and prevent them from getting signed. But the AWS session package provides a very flexible API for adding custom handlers which get invoked at various stages of the request lifecycle. Additionally, a session handler has the benefit of being applied to all AWS service requests (not just DynamoDB) which use that session. Perfect!

Here’s the AddTracingHeaders method above added as a session handler:

sess, err := session.NewSession(cfg)

// Add the AddTracingHeaders as the first Send handler. This is important as one
// of the default Send handlers does the work of sending the request.
sess.Handlers.Send.PushFront(AddTracingHeaders)

This looks promising. Testing showed that the first request to the AWS DynamoDB service is successful (200 OK!). Traces look good too! We are getting somewhere; time to test some failure scenarios.

I added an Istio fault injection rule to return an HTTP 500 error on outgoing DynamoDB requests to exercise the AWS sdk’s retry logic. Snap! We're back to receiving HTTP Status Code 400 with the AccessDeniedException error on every retry.

Looking at the AWS request send logic, it appears that on retriable errors the code makes a copy of the previous request, signs it, and then invokes the Send handlers. This means that on retries the previously added tracing headers would get signed again (i.e. the earlier problem is back, hence the 400s) and then the AddTracingHeaders handler would add the tracing headers back.

Now that we understand the issue, the solution we came up with is to add the tracing headers after the request is signed and before it is sent out, just like the earlier implementation. In addition, to make retries work, we now need to remove these headers after the request is sent so that the re-signing and re-invocation of AddTracingHeaders are handled correctly.

Final Iteration

Here’s what the final working version looks like:

func injectFromContextIntoHeader(ctx context.Context, header http.Header) {
  if span := ot.SpanFromContext(ctx); span != nil {
    ot.GlobalTracer().Inject(
      span.Context(),
      ot.HTTPHeaders,
      ot.HTTPHeadersCarrier(header))
  }
}

func AddTracingHeaders() awsrequest.Option {
  return func(req *awsrequest.Request) {
    injectFromContextIntoHeader(req.Context(), req.HTTPRequest.Header)
  }
}

// This is a bit odd, inject tracing headers into an empty header map so that
// we can remove them from the request.
func RemoveTracingHeaders(req *awsrequest.Request) {
  header := http.Header{}
  injectFromContextIntoHeader(req.Context(), header)
  for k := range header {
    req.HTTPRequest.Header.Del(k)
  }
}

sess, err := session.NewSession(cfg)

// Add the AddTracingHeaders as the first Send handler.
sess.Handlers.Send.PushFront(AddTracingHeaders)

// Pushback is used here so that this handler is added after the request has
// been sent.
sess.Handlers.Send.PushBack(RemoveTracingHeaders)

Agreed, the above solution looks far from elegant, but it does work. I hope this post helps if you are in a similar situation.

If you have a better solution feel free to reach out to me at neeraj@aspenmesh.io


Building Istio with Minikube-in-a-Container and Jenkins

Aspen Mesh provides a supported distribution of Istio, which means that we need to be able to test and release bugfixes even if they are out-of-cadence with the upstream Istio project. To do this we’ve developed our own build and test infrastructure. Now that we’ve got many of these pieces up and running, we figured some parts might be useful if you are also interested in CI for Istio but not committed to Circle CI or GKE.

This post will show how we made an updated Minikube-in-a-Container and a Jenkins pipeline that uses it to build and test Istio. If you want, you can docker run the minikube container right now and get a functioning Kubernetes cluster inside the container that you can throw away when you’re done. The Jenkins bits will help you build Istio today and also give you a head-start if you want to build containers inside of containers.

Minikube-in-a-Container

This part describes how we made a Minikube-in-a-container that we use to run the Istio smoke tests during a build. This isn’t our idea - we started with localkube-dind. We couldn’t get it working out-of-the-box, we think due to a little bit of drift between localkube and minikube, so this is a record of what we changed to get it working for us. We also added some options and tooling so that we can use Istio in the resulting container. Nothing too fancy but we’re hoping it gives you a head start if you’re heading down a similar path.

Minikube may be familiar to you as a project to start up your own Kubernetes cluster in a VM that you can carry around on your laptop. This approach is very convenient but there are some situations where you can’t/don’t want to provision a VM, like cloud providers that don’t offer nested virtualization. Since docker can now run inside of docker, we decided to try making our own Kubernetes cluster inside of a docker container. An ephemeral Kubernetes container is easy to start, run a few tests, and throw away when you’re done and is a good fit for CI.

In our model, the Kubernetes cluster creates child docker containers (not sibling containers, in the lingo of Jérôme Petazzoni’s consideration). We did this intentionally - we preferred the isolation of child containers over sharing the docker build cache. But you should check out Jérôme’s article before committing to DinD for your application - maybe DooD (Docker-outside-of-Docker) is better for you. FYI - we’ve avoided the “it gets worse” part, and it looks like the “bad” and “ugly” parts are fixed/avoidable for us.

When you start a docker container, you’re asking docker to create and setup a few namespaces in the kernel, and then start your container inside these namespaces. A namespace is a sandbox - when you’re inside the namespace, you can generally only see other things that are also inside the namespace. A chroot, but for more than just filesystems - PIDs, network interfaces, etc. If you start a docker container with --privileged then the namespaces that are created get extra privileges, like the ability to create more child namespaces. That’s the trick at the core of docker-in-docker. For any more details, again, Jérôme’s the expert - check out his explanation (complete with Xzibit memes) here.

OK, so here’s the flow:

  1. Build a container that’s got docker, minikube, kubectl and dependencies installed.
  2. Add a “fake-systemctl” shim to trick Minikube into running without a real systemd installation.
  3. Start the container with --privileged
  4. Have the container start its own “inner” dockerd - this is the DinD part.
  5. Have the container start minikube --vm-driver=none so that minikube (in the container) talks to the dockerd running right alongside it.

All you have to do is docker run --privileged this container and you’re ready to go with kubectl. If you want, you can run the kubectl inside the container and get a truly throw-away environment.

You can try it now:

docker run --privileged --rm -it quay.io/aspenmesh/minikube-dind
docker exec -it <container> /bin/bash
# kubectl get nodes
<....>
# kubectl create -f https://k8s.io/docs/tasks/debug-application-cluster/shell-demo.yaml
# kubectl exec -it shell-demo -- /bin/bash

When you exit, the --rm flag means that docker will tear down and throw away everything for you.

For heavier usage, you’ll probably want to “docker cp” the kubeconfig file to your host and talk to kubernetes inside the container over the exposed kube API port 8443.
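For example, here's a rough by-hand version of what the Jenkins pipeline below does (the container name and published port are illustrative, and /kubeconfig is where the image's start script writes its config):

# Start a longer-lived cluster container and publish the kube API port
docker run --privileged -d -p 8443:8443 --name minikube quay.io/aspenmesh/minikube-dind

# Wait for minikube to write its kubeconfig, then copy it to the host
docker exec minikube /bin/bash -c 'while ! [ -e /kubeconfig ]; do sleep 3; done'
docker cp minikube:/kubeconfig ./kubeconfig

# The apiserver cert is baked for the name "kubernetes", so make that name
# resolve locally and point the kubeconfig at it
sed -i'' -e 's;server: https://127.0.0.1:8443;server: https://kubernetes:8443;' kubeconfig
echo '127.0.0.1 kubernetes' | sudo tee -a /etc/hosts
KUBECONFIG=$PWD/kubeconfig kubectl get nodes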

Here’s the Dockerfile that makes it go (you can clone this and support scripts here):

# Portions Copyright 2016 The Kubernetes Authors All rights reserved.
# Portions Copyright 2018 AspenMesh
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Based on:
# https://github.com/kubernetes/minikube/tree/master/deploy/docker/localkube-dind
FROM debian:jessie
# Install minikube dependencies
RUN DEBIAN_FRONTEND=noninteractive apt-get update -y && \
    DEBIAN_FRONTEND=noninteractive apt-get -yy -q --no-install-recommends install \
        iptables \
        ebtables \
        ethtool \
        ca-certificates \
        conntrack \
        socat \
        git \
        nfs-common \
        glusterfs-client \
        cifs-utils \
        apt-transport-https \
        ca-certificates \
        curl \
        gnupg2 \
        software-properties-common \
        bridge-utils \
        ipcalc \
        aufs-tools \
        sudo \
    && DEBIAN_FRONTEND=noninteractive apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Install docker
RUN \
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
    apt-key export "9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88" | gpg - && \
    echo "deb [arch=amd64] https://download.docker.com/linux/debian jessie stable" >> \
        /etc/apt/sources.list.d/docker.list && \
    DEBIAN_FRONTEND=noninteractive apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get -yy -q --no-install-recommends install \
        docker-ce \
    && DEBIAN_FRONTEND=noninteractive apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

VOLUME /var/lib/docker
EXPOSE 2375

# Install minikube
RUN curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.24.1/minikube-linux-amd64 && chmod +x minikube
ENV MINIKUBE_WANTUPDATENOTIFICATION=false
ENV MINIKUBE_WANTREPORTERRORPROMPT=false
ENV CHANGE_MINIKUBE_NONE_USER=true

# minikube --vm-driver=none checks systemctl before starting. Instead of
# setting up a real systemd environment, install this shim to tell minikube
# what it wants to know: localkube isn't started yet.
COPY fake-systemctl.sh /usr/local/bin/systemctl
EXPOSE 8443

# Install kubectl
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.9.1/bin/linux/amd64/kubectl && \
    chmod a+x kubectl && \
    mv kubectl /usr/local/bin

# Copy local start.sh
COPY start.sh /start.sh
RUN chmod a+x /start.sh

# If nothing else specified, start up docker and kubernetes.
CMD /start.sh & sleep 4 && tail -F /var/log/docker.log /var/log/dind.log /var/log/minikube-start.log

Jenkins for Istio

Now that we’ve got Kubernetes-in-a-container we can use this for our Istio builds. Dockerized build systems are nice because developers can quickly create higher fidelity replicas of the CI build. Here’s an outline of our CI architecture for Istio builds:

  • Jenkins worker: This is a VM started by Jenkins for running builds. It may be shared by other builds at the same time. It’s important that any tooling we install on the worker is locally-scoped (so it doesn’t interfere with other builds) and ephemeral (we autoscale Jenkins workers to save costs).
  • Minikube container: The first thing we do is build and enter the Minikube container we talked about above. The rest of the build proceeds inside this container (or its children). The Jenkins workspace is mounted here. Jenkins’ docker plugin takes care of tearing this container down in success or failure, which is all we need to clean up all the running Kubernetes and Istio components.
  • Builder container: This is a container with build tools like the golang toolchain installed. It’s where we compile Istio and build containers for the Istio components. We test those components in the minikube container, and if they pass, declare the build a success and push the containers to our registry.

Most of the Jenkinsfile is about getting those pieces set up. After that, we run the same steps to build Istio that you would on your laptop: make depend, make build, make test.

Check out the Jenkinsfile here:

node('docker') {
  properties([disableConcurrentBuilds()])
  wkdir = "src/istio.io/istio"
  stage('Checkout') {
    checkout scm
  }
  // withRegistry writes to /home/ubuntu/.dockercfg outside of the container
  // (even if you run it inside the docker plugin) which won't be visible
  // inside the builder container, so copy them somewhere that will be
  // visible. We will symlink to .dockercfg only when needed to reduce
  // the chance of accidentally using the credentials outside of push
  docker.withRegistry('https://quay.io', 'name-of-your-credentials-in-jenkins') {
    stage('Load Push Credentials') {
      sh "cp ~/.dockercfg ${pwd()}/.dockercfg-quay-creds"
    }
  }
  k8sImage = docker.build(
    "k8s-${env.BUILD_TAG}",
    "-f $wkdir/.jenkins/Dockerfile.minikube " +
    "$wkdir/.jenkins/"
  )
  k8sImage.withRun('--privileged') { k8s ->
    stage('Get kubeconfig') {
      sh "docker exec ${k8s.id} /bin/bash -c \"while ! [ -e /kubeconfig ]; do echo waiting for kubeconfig; sleep 3; done\""
      sh "rm -f ${pwd()}/kubeconfig && docker cp ${k8s.id}:/kubeconfig ${pwd()}/kubeconfig"
      // Replace "127.0.0.1" with the path that peer containers can use to
      // get to minikube.
      // minikube will bake certs including the subject "kubernetes" so
      // the kube-api server needs to be reachable from the client's concept
      // of "https://kubernetes:8443" or kubectl will refuse to connect.
      sh "sed -i'' -e 's;server: https://127.0.0.1:8443;server: https://kubernetes:8443;' kubeconfig"
    }
    builder = docker.build(
      "istio-builder-${env.BUILD_TAG}",
      "-f $wkdir/.jenkins/Dockerfile.jenkins-build " +
      "--build-arg UID=`id -u` --build-arg GID=`id -g` " +
      "$wkdir/.jenkins",
    )
    builder.inside(
      "-e GOPATH=${pwd()} " +
      "-e HOME=${pwd()} " +
      "-e PATH=${pwd()}/bin:\$PATH " +
      "-e KUBECONFIG=${pwd()}/kubeconfig " +
      "-e DOCKER_HOST=\"tcp://kubernetes:2375\" " +
      "--link ${k8s.id}:kubernetes"
    ) {
      stage('Check') {
        sh "ls -al"
        // If there are old credentials from a previous build, destroy them -
        // we will only load them when needed in the push stage
        sh "rm -f ~/.dockercfg"
        sh "cd $wkdir && go get -u github.com/golang/lint/golint"
        sh "cd $wkdir && make check"
      }
      stage('Build') {
        sh "cd $wkdir && make depend"
        sh "cd $wkdir && make build"
      }
      stage('Test') {
        sh "cp kubeconfig $wkdir/pilot/platform/kube/config"
        sh """PROXYVERSION=\$(grep envoy-debug $wkdir/pilot/docker/Dockerfile.proxy_debug |cut -d: -f2) &&
              PROXY=debug-\$PROXYVERSION &&
              curl -Lo - https://storage.googleapis.com/istio-build/proxy/envoy-\$PROXY.tar.gz | tar xz &&
              mv usr/local/bin/envoy ${pwd()}/bin/envoy &&
              rm -r usr/"""
        sh "cd $wkdir && make test"
      }
      stage('Push') {
        sh "cd && ln -sf .dockercfg-quay-creds .dockercfg"
        sh "cd $wkdir && " +
           "make HUB=yourhub TAG=$BUILD_TAG push"
        gitTag = getTag(wkdir)
        if (gitTag) {
          sh "cd $wkdir && " +
             "make HUB=yourhub TAG=$gitTag push"
        }
        sh "cd && rm .dockercfg"
      }
    }
  }
}

String getTag(String wkdir) {
  return sh(
    script: "cd $wkdir && " +
            "git describe --exact-match --tags \$GIT_COMMIT || true",
    returnStdout: true
  ).trim()
}

If you want to grab the files from this post and the supporting scripts, go here.