Installing Multicluster Aspen Mesh on KOPS Cluster

I recently tried installing Aspen Mesh in a multicluster configuration, and it was easier than I anticipated. In this post, I will walk you through my process. You can read the original version of this process here.

First, ensure that you have two Kubernetes clusters with the same version of Aspen Mesh installed on each of them (if you need an Aspen Mesh account, you can get a free 30-day trial here). Once you have an account, refer to the documentation for installing Aspen Mesh on your clusters.

kops get cluster

ssah-test1.dev.k8s.local        aws    us-west-2a
ssah-test2.dev.k8s.local        aws    us-west-2a

There are multiple ways to configure Aspen Mesh in a multicluster environment. In the following example, I have installed Aspen Mesh 1.9.1-am1 on both of my clusters, and the installation model is Multi-Primary on different networks.

Prerequisites for the Setup:

  • API: the API server of each cluster must be able to access the API server of the other cluster.
  • Trust: trust must be established between all clusters in the mesh. This is achieved by using a common root CA to generate intermediate certs for each cluster.
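
Throughout the rest of this walkthrough, commands refer to the two clusters through the kubectl context variables ${CTX_CLUSTER1} and ${CTX_CLUSTER2}. Here is a minimal sketch of how I set them up, assuming kops exports kubeconfig contexts named after the clusters (check kubectl config get-contexts for the actual names on your machine):

# Export kubeconfig entries for both clusters (context names are an assumption)
kops export kubecfg ssah-test1.dev.k8s.local
kops export kubecfg ssah-test2.dev.k8s.local

export CTX_CLUSTER1=ssah-test1.dev.k8s.local
export CTX_CLUSTER2=ssah-test2.dev.k8s.local

# Sanity check: both contexts should respond
kubectl --context="${CTX_CLUSTER1}" get nodes
kubectl --context="${CTX_CLUSTER2}" get nodes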

Configuring Trust:

I am creating an RSA root certificate. After downloading and extracting the Aspen Mesh binary, I create a certs folder and push it onto the directory stack.

mkdir -p certs
pushd certs

The downloaded binary should include a tools directory for creating your certificates. Run the make command to create a root-ca folder, which will contain four files: root-ca.conf, root-cert.csr, root-cert.pem and root-key.pem. For each of your clusters, you will then generate an intermediate cert and key for the Istio CA.

make -f ../tools/certs/Makefile.selfsigned.mk root-ca
make -f ../tools/certs/Makefile.selfsigned.mk cluster1-cacerts
make -f ../tools/certs/Makefile.selfsigned.mk cluster2-cacerts
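
Based on the Istio cert Makefile's default layout, you should now have a cluster1 and a cluster2 folder under certs, each containing ca-cert.pem, ca-key.pem, cert-chain.pem and a copy of root-cert.pem. As a quick sanity check, you can confirm that both intermediate certs chain back to the same root (the paths are assumptions; adjust them if your layout differs):

# Each intermediate CA cert should verify against the shared root cert
openssl verify -CAfile cluster1/root-cert.pem cluster1/ca-cert.pem
openssl verify -CAfile cluster2/root-cert.pem cluster2/ca-cert.pem

# The root cert copied into each cluster folder should be identical
diff cluster1/root-cert.pem cluster2/root-cert.pem && echo "same root cert"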

You will then create a secret named cacerts in the istio-system namespace of each cluster, using the files generated in the previous step. These secrets are what establish trust between the clusters, since the same root-cert.pem is used to create each intermediate cert.

kubectl create secret generic cacerts -n istio-system \
  --from-file=ca-cert.pem \
  --from-file=ca-key.pem \
  --from-file=root-cert.pem \
  --from-file=cert-chain.pem --context="${CTX_CLUSTER1}"

kubectl create secret generic cacerts -n istio-system \
  --from-file=ca-cert.pem \
  --from-file=ca-key.pem \
  --from-file=root-cert.pem \
  --from-file=cert-chain.pem --context="${CTX_CLUSTER2}"
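
The commands above assume the four .pem files are in your current directory, for example after changing into the certs/cluster1 or certs/cluster2 folder generated earlier; adjust the --from-file paths if you keep them elsewhere. Once created, you can confirm each secret holds the four expected files:

# The cacerts secret should contain ca-cert.pem, ca-key.pem, cert-chain.pem and root-cert.pem
kubectl get secret cacerts -n istio-system --context="${CTX_CLUSTER1}" -o json | jq '.data | keys'
kubectl get secret cacerts -n istio-system --context="${CTX_CLUSTER2}" -o json | jq '.data | keys'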

Next, we will move on to the Aspen Mesh configuration, where we enable multicluster support for istiod and give names to the mesh, cluster, and network. Add the following fields to your override file, which will be used during the Helm installation/upgrade, creating a separate file for each cluster. You will also need to label the istio-system namespace in both of your clusters with the appropriate network label.

kubectl --context="${CTX_CLUSTER1}" label namespace istio-system topology.istio.io/network=network1

kubectl --context="${CTX_CLUSTER2}" label namespace istio-system topology.istio.io/network=network2
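
You can confirm the network label was applied in both clusters:

# istio-system should now carry the topology.istio.io/network label
kubectl --context="${CTX_CLUSTER1}" get namespace istio-system --show-labels
kubectl --context="${CTX_CLUSTER2}" get namespace istio-system --show-labels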

For Cluster 1

#Cluster 1

#In order to make the application service callable from any cluster, the DNS lookup must succeed in each cluster
#This provides DNS interception for all workloads with a sidecar, allowing Istio to perform DNS lookup on behalf of the application.
meshConfig:
  defaultConfig:
    proxyMetadata:
    # Enable Istio agent to handle DNS requests for known hosts
    # Unknown hosts will automatically be resolved using upstream dns servers in resolv.conf
      ISTIO_META_DNS_CAPTURE: "true"

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster1"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network1

For Cluster 2

#Cluster 2

#In order to make the application service callable from any cluster, the DNS lookup must succeed in each cluster
#This provides DNS interception for all workloads with a sidecar, allowing Istio to perform DNS lookup on behalf of the application.
meshConfig:
  defaultConfig:
    proxyMetadata:
    # Enable Istio agent to handle DNS requests for known hosts
    # Unknown hosts will automatically be resolved using upstream dns servers in resolv.conf
      ISTIO_META_DNS_CAPTURE: "true"

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster2"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network2

Now we will upgrade (or install) the istiod manifest with the newly added configuration from the override files. As you can see, I have a separate override file for each cluster.

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system --values sample_overrides-aspenmesh_2.yaml

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system --values sample_overrides-aspenmesh.yaml
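
Neither command names a cluster explicitly, so make sure your current kubectl context points at the intended cluster before each upgrade, or pass the context to Helm directly. A minimal sketch (the mapping of override files to clusters is illustrative):

# Pin each upgrade to a cluster with --kube-context to avoid upgrading the wrong one
helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system \
  --kube-context "${CTX_CLUSTER1}" --values sample_overrides-aspenmesh.yaml
helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system \
  --kube-context "${CTX_CLUSTER2}" --values sample_overrides-aspenmesh_2.yaml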

Check the pods in the istio-system namespace to see that they are all in a running state. Be sure to delete the application pods in your default namespace so that the new configuration takes effect when the pods are recreated. You can also check that the root cert used by pods in each cluster is the same; I am using pods from the bookinfo sample application.

istioctl pc secrets details-v1-79f774bdb9-pqpjw -o json | jq '[.dynamicActiveSecrets[] | select(.name == "ROOTCA")][0].secret.validationContext.trustedCa.inlineBytes' -r | base64 -d | openssl x509 -noout -text | md5

istioctl pc secrets details-v1-79c697d759-tw2l7 -o json | jq '[.dynamicActiveSecrets[] | select(.name == "ROOTCA")][0].secret.validationContext.trustedCa.inlineBytes' -r | base64 -d | openssl x509 -noout -text |md5

Once istiod is upgraded, we will move on to the gateway used for communication between the two clusters by installing an east-west gateway. Use the configuration below to create a YAML file for each cluster, to be used with Helm during installation. I have created two files, cluster1_gateway_config.yaml and cluster2_gateway_config.yaml, which will be used with their respective clusters.

For Cluster 1

#This can be on separate override file as we will install a custom IGW
gateways:
  istio-ingressgateway:
    name: istio-eastwestgateway
    labels:
      app: istio-eastwestgateway
      istio: eastwestgateway
      topology.istio.io/network: network1
    ports:
    ## You can add custom gateway ports in user values overrides, but it must include those ports since helm replaces.
    # Note that AWS ELB will by default perform health checks on the first port
    # on this list. Setting this to the health check port will ensure that health
    # checks always work. https://github.com/istio/istio/issues/12503
    - port: 15021
      targetPort: 15021
      name: status-port
      protocol: TCP
    - port: 80
      targetPort: 8080
      name: http2
      protocol: TCP
    - port: 443
      targetPort: 8443
      name: https
      protocol: TCP
    - port: 15012
      targetPort: 15012
      name: tcp-istiod
      protocol: TCP
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
      protocol: TCP
    - name: tls-webhook
      port: 15017
      targetPort: 15017
    env:
      # A gateway with this mode ensures that pilot generates an additional
      # set of clusters for internal services but without Istio mTLS, to
      # enable cross cluster routing.
      ISTIO_META_ROUTER_MODE: "sni-dnat"
      ISTIO_META_REQUESTED_NETWORK_VIEW: "network1"
    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster1"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network1

For Cluster 2

gateways:
  istio-ingressgateway:
    name: istio-eastwestgateway
    labels:
      app: istio-eastwestgateway
      istio: eastwestgateway
      topology.istio.io/network: network2
    ports:
    ## You can add custom gateway ports in user values overrides, but it must include those ports since helm replaces.
    # Note that AWS ELB will by default perform health checks on the first port
    # on this list. Setting this to the health check port will ensure that health
    # checks always work. https://github.com/istio/istio/issues/12503
    - port: 15021
      targetPort: 15021
      name: status-port
      protocol: TCP
    - port: 80
      targetPort: 8080
      name: http2
      protocol: TCP
    - port: 443
      targetPort: 8443
      name: https
      protocol: TCP
    - port: 15012
      targetPort: 15012
      name: tcp-istiod
      protocol: TCP
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
      protocol: TCP
    - name: tls-webhook
      port: 15017
      targetPort: 15017
    env:
      # A gateway with this mode ensures that pilot generates an additional
      # set of clusters for internal services but without Istio mTLS, to
      # enable cross cluster routing.
      ISTIO_META_ROUTER_MODE: "sni-dnat"
      ISTIO_META_REQUESTED_NETWORK_VIEW: "network2"
    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster2"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network2

helm install istio-eastwestgateway manifests/charts/gateways/istio-ingress --namespace istio-system --values cluster1_gateway_config.yaml

helm install istio-eastwestgateway manifests/charts/gateways/istio-ingress --namespace istio-system --values cluster2_gateway_config.yaml
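
After each install, you can check that the east-west gateway pod and its LoadBalancer service came up in the respective cluster:

# Gateway pods (selected by the label set in the override file) and the service
kubectl --context="${CTX_CLUSTER1}" -n istio-system get pods -l istio=eastwestgateway
kubectl --context="${CTX_CLUSTER1}" -n istio-system get svc istio-eastwestgateway
kubectl --context="${CTX_CLUSTER2}" -n istio-system get pods -l istio=eastwestgateway
kubectl --context="${CTX_CLUSTER2}" -n istio-system get svc istio-eastwestgateway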

After adding the new east-west gateway, you will get an east-west gateway pod deployed in the istio-system namespace, along with a service that creates the Network Load Balancer specified in the annotations. Until Multi-Cluster/Multi-Network – Cannot use a hostname-based gateway for east-west traffic · Issue #29359 · istio/istio is fixed, you will need to resolve the IP address of the NLB for each east-west gateway and patch it into the service as spec.externalIPs in both of your clusters. This workaround is not ideal, but it is required until that issue is resolved.

k get svc -n istio-system istio-eastwestgateway
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                                  PORT(S)                                                                                      AGE
istio-eastwestgateway   LoadBalancer   100.71.211.32   a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com 15021:32138/TCP,80:30420/TCP,443:31450/TCP,15012:30150/TCP,15443:30476/TCP,15017:32335/TCP   8d

nslookup a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com
Server:        172.23.241.180
Address:    172.23.241.180#53
Non-authoritative answer:
Name:    a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com
Address: 35.X.X.X

kubectl patch svc -n istio-system istio-eastwestgateway -p '{"spec":{"externalIPs": ["35.X.X.X"]}}'

k get svc -n istio-system istio-eastwestgateway
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                                  PORT(S)                                                                                      AGE
istio-eastwestgateway   LoadBalancer   100.71.211.32   a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com,35.X.X.X   15021:32138/TCP,80:30420/TCP,443:31450/TCP,15012:30150/TCP,15443:30476/TCP,15017:32335/TCP   8d
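
Repeat the nslookup and patch in the second cluster using the IP resolved from its own NLB hostname (the address below is a placeholder):

# Same workaround for cluster 2
kubectl --context="${CTX_CLUSTER2}" patch svc -n istio-system istio-eastwestgateway -p '{"spec":{"externalIPs": ["<CLUSTER2_NLB_IP>"]}}'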

Now that the gateways are configured to communicate, you will have to make sure the API server of each cluster can talk to the other cluster. In AWS, you can do this by adding security group rules that allow the API server instances of the two clusters to reach each other. We then need to create a secret in cluster 1 that provides access to cluster 2's API server, and vice versa, for endpoint discovery.

# Install a remote secret in cluster 2 that provides access to cluster 1's API server
istioctl x create-remote-secret --context="${CTX_CLUSTER1}" --name=cluster1 | kubectl apply -f - --context="${CTX_CLUSTER2}"

# Install a remote secret in cluster 1 that provides access to cluster 2's API server
istioctl x create-remote-secret --context="${CTX_CLUSTER2}" --name=cluster2 | kubectl apply -f - --context="${CTX_CLUSTER1}"
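
A quick way to check that each control plane registered the other cluster is to look at the istiod logs, which is also where the remote-cluster message mentioned below appears:

# istiod should log that it has added the remote cluster
kubectl --context="${CTX_CLUSTER1}" -n istio-system logs deploy/istiod | grep -i "remote cluster"
kubectl --context="${CTX_CLUSTER2}" -n istio-system logs deploy/istiod | grep -i "remote cluster"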

At this stage, pilot (which is bundled in the istiod binary) should have the new configuration, and when you tail the logs for the istiod pod you should see the message “Number of remote cluster: 1”. With this version, you will also need to edit the east-west ingress gateway resource in the istio-system namespace that we created above, because the selector label and the annotation added via the Helm chart are different than expected: they show “istio: ingressgateway” but should be “istio: eastwestgateway”. You can now create pods in each cluster and verify that cross-cluster traffic works as expected. Here is how the east-west gateway should look:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  annotations:
    meta.helm.sh/release-name: istio-eastwestgateway
    meta.helm.sh/release-namespace: istio-system
  creationTimestamp: "2021-05-13T01:56:50Z"
  generation: 2
  labels:
    app: istio-eastwestgateway
    app.kubernetes.io/managed-by: Helm
    install.operator.istio.io/owning-resource: unknown
    istio: eastwestgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio-eastwestgateway
    topology.istio.io/network: network2
  name: istio-multicluster-ingressgateway
  namespace: istio-system
  resourceVersion: "6777467"
  selfLink: /apis/networking.istio.io/v1beta1/namespaces/istio-system/gateways/istio-multicluster-ingressgateway
  uid: 618b2b5b-a2bb-4b37-a4a1-7f5ab7ef03d4
spec:
  selector:
    istio: eastwestgateway
  servers:
  - hosts:
    - '*.local'
    port:
      name: tls
      number: 15443
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH
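
To confirm that cross-cluster routing is wired up end to end, you can inspect the endpoints a sidecar has for one of your services; in this Multi-Primary, multi-network model, remote endpoints show up as the other cluster's east-west gateway address on port 15443 rather than pod IPs. A minimal sketch using one of the bookinfo pods (the pod name is the one from my cluster; substitute your own):

# Endpoints for the reviews service as seen from the details sidecar in cluster 1;
# entries for cluster 2 should point at its east-west gateway on port 15443
istioctl --context="${CTX_CLUSTER1}" proxy-config endpoints details-v1-79f774bdb9-pqpjw | grep reviews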



Improving Your Application with Service Mesh

Engineering + Technology = Uptime 

Have you come across the term “application value” lately? Software-first organizations are using it as a new form of currency. Businesses that deliver a product or service to customers through an application understand the growing importance of that application’s security, reliability and feature velocity. And as the applications people use become increasingly important to enterprises, so do engineering teams and the right tools.

The Right People for the Job: Efficient Engineering Teams 

Access to engineering talent is now more important to some companies than access to capital; 61% of executives consider this a potential threat to their business. With the average developer spending more than 17 hours each week dealing with maintenance issues such as debugging and refactoring, plus approximately four hours a week on “bad code” (representing nearly $85 billion worldwide in lost opportunity cost annually), the need to drive business value with applications only increases. And who can help solve these puzzles? The right engineering team, in combination with the right technologies and tools. Regarding the piece of the puzzle that can be solved by your engineering team, enterprises have two options as customer demands on applications increase:

  1. Increase the size and cost of engineering teams, or  
  2. Increase your engineering efficiency.  

Couple the need to increase the efficiency of your engineering team with the challenge of growing revenue in increasingly competitive, low-margin businesses, and the importance of driving value through applications is top of mind for any business. One way to help make your team more efficient is to provide the right technologies and tools.

The Right Technology for the Job: Microservices and Service Mesh 

Using microservices architectures allows enterprises to deliver new features to customers more quickly, keeping them happy and providing them with more value over time. In addition, with microservices, businesses can more easily keep pace with the competition in their space through better application scalability, resiliency and agility. Of course, as with any shift in technology, there can be new challenges.

One challenge our customers sometimes face is difficulty with debugging or resolving problems within these microservices environments. It can be challenging to fix issues fast, especially when cascading failures cause your users to have a bad experience with your application. That’s where a service mesh can help.

A service mesh provides ways to see, identify, trace and log errors as they occur and to pinpoint their sources. It brings all of your data together into a single source of truth, removing error-prone processes and giving you fast, reliable information about downtime, failures and outages. More uptime means happier users and more revenue, along with the agility and stability you need for a competitive edge.

Increasing Your Application Value  

Service mesh allows engineering teams to address many issues, but especially these three critical areas: 

  • Proactive issue detection, quick incident response, and workflows that accelerate fixing issues 
  • A unified source of multi-dimensional insights into application and infrastructure health and performance that provides context about the entire software system 
  • Line of sight into weak points in environments, enabling engineering teams to build more resilient systems in the future  

If you or your team are running Kubernetes-based applications at scale and seeing the advantages, but know you can get more value out of them by increasing your engineering efficiency and your applications’ uptime for users, it’s probably time to check out a service mesh. You can reach out to the Aspen Mesh team at hello@aspenmesh.io to learn how to easily get started or how to best integrate a service mesh into your existing stack. Or you can get started yourself with a 30-day free trial of Aspen Mesh.