Recommended Remediation for Kubernetes CVE-2019-11247

A Kubernetes vulnerability, CVE-2019-11247, was announced this week.  While this is a vulnerability in Kubernetes itself and not in Istio, it may affect Aspen Mesh users. This blog will help you understand the possible consequences and how to remediate them.

  • You should mitigate this vulnerability by updating to Kubernetes 1.13.9, 1.14.5 or 1.15.2 as soon as possible.
  • This vulnerability affects the interaction of Roles (a Kubernetes RBAC resource) and CustomResources (Istio uses CustomResources for things like VirtualServices, Gateways, DestinationRules and more).
  • Only certain definitions of Roles are vulnerable.
  • Aspen Mesh's installation does not define any such Roles, but you might have defined them for other software in your cluster.
  • More explanation and recovery details below.

Explanation

If you have a Role defined anywhere in your cluster with a "*" for resources or apiGroups, then anything that can use that Role could escalate to modify many CustomResources.

This Kubernetes issue and this helpful blog have extensive details and an example walkthrough that we'll summarize here.  Kubernetes Roles define sets of permissions for resources in one particular namespace (like "default").  They are not supposed to define permissions for resources in other namespaces or resources that live globally (outside of any namespace); that's what ClusterRoles are for.

Here's an example Role from Aspen Mesh that says "The IngressGateway should be able to get, watch or list any secrets in the istio-system namespace", which it needs to bootstrap secret discovery to get keys for TLS or mTLS:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: istio-ingressgateway-sds
  namespace: istio-system
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]

This Role does not grant permissions to secrets in any other namespace, or grant permissions to anything aside from secrets.   You'd need additional Roles and RoleBindings to add those. It doesn't grant permissions to get or modify cluster-wide resources.  You'd need ClusterRoles and ClusterRoleBindings to add those.

The vulnerability is that if the Role grants access to CustomResources in one namespace, it accidentally grants access to the same kinds of CustomResources that exist at global scope.  This is not exploitable in many cases because if you have a namespace-scoped CustomResource called "Foo", you can't also have a global CustomResource called "Foo", so there are no global-scope Foos to attack.  Unfortunately, if your role allows access to a resource "*" or apiGroup "*", then the "*" matches both namespace-scoped Foo and globally-scoped Bar in vulnerable versions of Kubernetes. This Role could be used to attack Bar.

If you're really scrutinizing the above example, note that the apiGroup "" is different than the apiGroup "*": the empty "" refers to core Kubernetes resources like Secrets, while "*" is a wildcard meaning any.
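For contrast, here is a sketch of the kind of Role that would be vulnerable.  The name and namespace are hypothetical; what matters is the wildcards:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: overly-broad-role     # hypothetical name, for illustration only
  namespace: default
rules:
- apiGroups: ["*"]            # wildcard apiGroup also matches Istio's CustomResource groups
  resources: ["*"]            # wildcard resource matches globally-scoped kinds in vulnerable Kubernetes versions
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

In a vulnerable cluster, a subject bound to a Role like this could potentially read or modify globally-scoped CustomResources as well.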

Aspen Mesh and Istio define three globally-scoped CustomResources: ClusterRbacConfig, MeshPolicy, ClusterIssuer.  If you had a vulnerable Role defined and an attacker could assume that Role, then they could have modified those three CustomResources.  Aspen Mesh does not provide any vulnerable Roles so a cluster would need to have those Roles defined for some other purpose.

Recovery

First and as soon as possible, you should upgrade Kubernetes to a non-vulnerable version: 1.13.9, 1.14.5 or 1.15.2.

When you define Roles and ClusterRoles, you should follow the Principle of Least Privilege and avoid granting access to "*" - always prefer listing the specific resources required.
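For example, if a team only needs to manage Istio traffic resources in its own namespace, a narrowly-scoped Role might look like the following sketch (the name and namespace are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: istio-traffic-editor   # illustrative name
  namespace: my-app            # illustrative namespace
rules:
- apiGroups: ["networking.istio.io"]
  resources: ["virtualservices", "destinationrules"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]

Because it names specific apiGroups and resources rather than using wildcards, this Role does not create the wildcard exposure described above.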

You can examine your cluster for vulnerable Roles.  If any exist, or have existed, those could have been exploited by an attacker with knowledge of this vulnerability to modify Aspen Mesh configuration.  To mitigate this, recreate any CustomResources after upgrading to a non-vulnerable version of Kubernetes.

This snippet will print out any vulnerable Roles currently configured, but cannot tell you if any may have existed in the past.  It relies on the jq tool:

kubectl get role --all-namespaces -o json |jq '.items[] | select(.rules[].resources | index("*"))'
kubectl get role --all-namespaces -o json |jq '.items[] | select(.rules[].apiGroups | index("*"))'

This is not a vulnerability in Aspen Mesh, so there is no need to upgrade Aspen Mesh.


To Multicluster, or Not to Multicluster: Solving Kubernetes Multicluster Challenges with a Service Mesh

If you are going to run multiple clusters for development and organizational reasons, it's important to understand your requirements, decide whether you want to connect those clusters in a multicluster environment and, if so, understand the various approaches and the tradeoffs associated with each option.

Kubernetes has become the container orchestration standard, and many organizations are currently running multiple clusters. But while communication issues within clusters are largely solved, communication across clusters is still a major challenge for most organizations.

Service mesh helps to address multicluster challenges. Start by identifying what you want, then shift to how to get it. We recommend understanding your specific communication use case, identifying your goals, then creating an implementation plan.

Multicluster offers a number of benefits:

  • Single pane of glass
  • Unified trust domain
  • Independent fault domains
  • Intercluster traffic
  • Heterogeneous/non-flat network

These benefits can be achieved with various approaches:

  • Independent clusters
  • Common management
  • Cluster-aware service routing through gateways
  • Flat network
  • Split-horizon Endpoints Discovery Service (EDS)

If you have decided to multicluster, your next move is deciding the best implementation method and approach for your organization. A service mesh like Istio can help, and when used properly can make multicluster communication painless.

Read the full article here on InfoQ’s site.

 


Running Stateful Apps with Service Mesh: Kubernetes Cassandra with Istio mTLS Enabled

Cassandra is a popular, heavy-load, highly performant, distributed NoSQL database.  It is fully integrated into many mainstay cloud and cloud-native architectures. At companies such as Netflix and Spotify, Cassandra clusters provide continuous availability, fault tolerance, resiliency and scalability.

Critical and sensitive data is sent to and from a Cassandra database.  When deployed in a Kubernetes environment, ensuring the data is secure and encrypted is a must.  Understanding data patterns and performance latencies across nodes becomes essential, as your Cassandra environment spans multiple datacenters and cloud vendors.

A service mesh provides service visibility, distributed tracing, and mTLS encryption.  

While it’s true Cassandra provides its own TLS encryption, one of the compelling features of Istio is the ability to uniformly administer mTLS for all of your services.  With a service mesh, you can set up an easy and consistent policy where Istio automatically manages the certificate rotation. Pulling Cassandra into a service mesh pairs the capabilities of the two technologies in a way that makes running stateful services much easier.

In this blog, I’ll cover the steps necessary to configure Istio with mTLS enabled in a Kubernetes Cassandra environment.  We collected some information from the Istio community, did some testing ourselves and pieced together a workable solution.  One of the benefits you get with Aspen Mesh is our Istio expertise from running Istio in production for the past 18 months.  We are tightly engaged with the Istio community and continually testing and working out the kinks of upstream Istio. We’re here to help you with your service mesh path to production!

Let’s consider how Cassandra operates.  To achieve continuous availability, Cassandra uses a “ring” communication approach, meaning each node communicates continually with the other existing nodes. For node consensus, each node sends metadata to several of its peers through a protocol called Gossip.  The receiving nodes then “gossip” to the remaining nodes. This Gossip exchange is similar to a TCP three-way handshake, and all of the metadata, like heartbeat state, node status, location and so on, is messaged across nodes via IP address and port.

In a Kubernetes deployment, Cassandra nodes are deployed as StatefulSets to ensure the allocated number of Cassandra nodes are available at all times. Persistent volumes are associated with the Cassandra StatefulSets, and a headless service is created to ensure a stable network ID.  This allows Kubernetes to restart a pod on another node and transfer its state seamlessly to the new node.

Now, here’s where it gets tricky.  When implementing an Istio service mesh with mTLS enabled, the Envoy sidecar intercepts all of the traffic from the Cassandra nodes, verifies where it’s coming from, decrypts it and sends the payload to the Cassandra pod through an internal loopback address.  The Cassandra nodes are all listening on their Pod IPs for gossip, but Envoy forwards only to 127.0.0.1, where Cassandra isn't listening. Let’s walk through how to solve this issue.

Setting up the Mesh:

We used the cassandra:v13 image from the Google repo for our Kubernetes Cassandra environment. There are a few things you’ll need to ensure are included in the Cassandra manifest at the time of deployment.  The Cassandra Service needs to be headless (clusterIP: None), and you need to declare the additional named ports that Cassandra uses to communicate:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  namespace: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - name: tcp-client
    port: 9042
  - port: 7000
    name: tcp-intra-node
  - port: 7001
    name: tcp-tls-intra-node
  - port: 7199
    name: tcp-jmx
  selector:
    app: cassandra

The next step is to tell each Cassandra node to listen to the Envoy loopback address.  

This image, by default, sets Cassandra’s listen address to the Kubernetes Pod IP.  The listen address needs to be set to the localhost loopback address instead, which allows the Envoy sidecar to pass communication through to the Cassandra nodes.

To enable this, you will need to change Cassandra’s configuration file, cassandra.yaml.

We did this by adding a substitution to our Kubernetes Cassandra manifest, based on a workaround from the related Istio bug report:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: cassandra
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        command: [ "/usr/bin/dumb-init", "/bin/bash", "-c", "sed -i 's/^CASSANDRA_LISTEN_ADDRESS=.*/CASSANDRA_LISTEN_ADDRESS=\"127.0.0.1\"/' /run.sh && /run.sh" ]
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042

This simple change uses sed to patch the Cassandra startup script so that the node listens on localhost.

If you're not using the google-samples/cassandra container you should modify your Cassandra config or container to set the listen_address to 127.0.0.1.  For some containers, this may already be the default.
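For example, the equivalent change in cassandra.yaml itself is a one-line setting (a sketch; how you deliver the file depends on your image and tooling):

# cassandra.yaml fragment: make the node listen on the loopback address
# so that gossip traffic arrives via the Envoy sidecar
listen_address: 127.0.0.1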

You'll need to remove any ServiceEntry or VirtualService resources associated with the Cassandra deployment, as no additional routing entries or rules are necessary.  Nothing external needs to reach Cassandra: it is now inside the mesh, and communication simply passes through the sidecar to each node.
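For example, you can check for and remove any such resources like this (the resource name is a placeholder for whatever you may have created earlier):

kubectl get virtualservice,serviceentry -n cassandra
kubectl delete virtualservice <name> -n cassandra
kubectl delete serviceentry <name> -n cassandra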

Since the Cassandra Service is headless (clusterIP: None), a DestinationRule does not need to be added.  When there is no clusterIP assigned, Istio defines the load balancing mode as PASSTHROUGH by default.

If you are using Aspen Mesh, the global MeshPolicy has mTLS enabled by default, so no changes are necessary.  You can verify this by inspecting it:

$ kubectl edit meshpolicy default -o yaml
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
.
. #edited out
.
spec:
  peers:
  - mtls: {}

Finally, create a Cassandra namespace, enable automatic sidecar injection and deploy Cassandra.

$ kubectl create namespace cassandra
$ kubectl label namespace cassandra istio-injection=enabled
$ kubectl -n cassandra apply -f <Cassandra-manifest>.yaml

Here is the output that shows the Cassandra nodes running with Istio sidecars.

$ kubectl get pods -n cassandra                                                                                   
NAME                     READY     STATUS    RESTARTS   AGE
cassandra-0              2/2       Running   0          22m
cassandra-1              2/2       Running   0          21m
cassandra-2              2/2       Running   0          20m
cqlsh-5d648594cb-86rq9   2/2       Running   0          2h

Here is the output validating mTLS is enabled.

$ istioctl authn tls-check cassandra.cassandra.svc.cluster.local

 
HOST:PORT            STATUS     SERVER     CLIENT     AUTHN POLICY     DESTINATION RULE
cassandra...:7000    OK         mTLS       mTLS       default/         default/istio-system

Here is the output validating that the Cassandra nodes are communicating with each other and balancing data ownership across the ring.

$ kubectl exec -it -n cassandra cassandra-0 -c cassandra -- nodetool status
Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN  100.96.1.225  129.92 KiB  32      71.8%             f65e8c93-85d7-4b8b-ae82-66f26b36d5fd  Rack1-K8Demo
UN  100.96.3.51   157.68 KiB  32      55.4%             57679164-f95f-45f2-a0d6-856c62874620  Rack1-K8Demo
UN  100.96.4.59   142.07 KiB  32      72.8%             cc4d56c7-9931-4a9b-8d6a-d7db8c4ea67b  Rack1-K8Demo

If this is a solution that can make things easier in your environment, sign up for the free Aspen Mesh Beta.  It will guide you through an automated Istio installation; then you can install Cassandra using the manifest covered in this blog, which can be found here.


Using Kubernetes RBAC to Control Global Configuration In Istio

Why Configuration Matters

If I'm going to get an error for my code, I like to get an error as soon as possible.  Unit test failures are better than integration test failures. I prefer compiler errors to unit test failures - that's what makes TypeScript great.  Going even further, a syntax highlighter is a very proximate feedback of errors - if that keyword doesn't turn green, I can fix the spelling with almost no conscious burden.

Shifting from coding to configuration, I like narrow configuration systems that make it easy to specify a valid configuration and tell me as quickly as possible when I've done something wrong.  In fact, my dream configuration specification wouldn't allow me to specify something invalid. Remember Newspeak from 1984?  A language so narrow that expressing ungood thoughts becomes impossible.  Apparently, I like my programming and configuration languages to be dystopic and Orwellian.

If you can't further narrow the language of your configuration without giving away core functionality, the next best option is to narrow the scope of configuration.  I think about this in reverse - if I am observing some behavior from the system and say to myself, "Huh, that's weird", how much configuration do I have to look at before I understand?  It's great if this is a small and expanding ring - look at config that's very local to this particular object, then the next layer up (in Kubernetes, maybe the rest of the namespace), and so on to what is hopefully a very small set of global config.  Ever tried to debug a program with global variables all over the place? Not fun.

My three principles of ideal config:

  1. Narrow: Like Newspeak, don't allow me to even think of invalid configuration.
  2. Scope: Only let me affect a small set of things associated with my role.
  3. Time: Tell me as early as possible if it's broken.

Declarative config is readily narrow.  The core philosophy of declarative config is saying what you want, not how you get it.  For example, "we'll meet at the Denver Zoo at noon" is declarative. If instead I specify driving directions to the Denver Zoo, I'm taking a much more imperative approach.  What if you want to bike there? What if there is road construction and a detour is required? The value of declarative config is that if we focus on what we want, instead of how to get it, it's easier for me to bring my controller (my car's GPS) and you to bring yours (Google Maps in the Bike setting).

On the other hand, a big part of configuration is pulling a bunch of disparate pieces of information together at the last moment, from a bunch of different roles (think humans, other configuration systems and controllers), just before the system actually starts running.  Some amount of flexibility is required here.

Does Cloud Native Get Us To Better Configuration?

I think a key reason for the popularity of Kubernetes is that it has a great syntax for specifying what a healthy, running microservice looks like.  Its syntax is powerful enough in all the right places to be practical for infrastructure.

Service meshes like Istio robustly connect all the microservices running in your cluster.  They can adaptively route L7 traffic, provide end-to-end mTLS based encryption, and provide circuit breaking and fault injection.  The long feature list is great, but it's not surprising that the result is a somewhat complex set of configuration resources. It's a natural result of the need for powerful syntax to support disparate use cases coupled with rapid development.

Enabling Fine-grained RBAC with Traffic Claim Enforcer

At Aspen Mesh, we found users (including ourselves) spending too much time understanding misconfiguration.  The first way we addressed that problem was with Istio Vet, which is designed to warn you of probably incorrect or incomplete config, and provide guidance to fix it.  Sometimes we know enough that we can prevent the misconfiguration by refusing to allow it in the first place.  For some Istio config resources, we do that using a solution we call Traffic Claim Enforcer.

There are four Istio configuration resources that have global implications: VirtualService, Gateway, ServiceEntry and DestinationRule.  Whenever you create one of these resources, you create it in a particular namespace. They can affect how traffic flows through the service mesh to any target they specify, even if that target isn't in the current namespace.  This surfaces a scope anti-pattern - if I'm observing weird behavior for some service, I have to examine potentially all DestinationRules in the entire Kubernetes cluster to understand why.
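As an illustrative sketch (the names are hypothetical), a DestinationRule created in one team's namespace can change traffic policy for a service owned by another team:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: surprise-policy          # hypothetical
  namespace: team-a              # created in team-a...
spec:
  host: reviews.team-b.svc.cluster.local   # ...but affects a service in team-b
  trafficPolicy:
    loadBalancer:
      simple: RANDOM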

That might work in the lab, but we found it to be a serious problem for applications running in production.  Not only is it hard to understand the current config state of the system, it's also easy to break. It’s important to have guardrails that make it so the worst thing I can mess up when deploying my tiny microservice is my tiny microservice.  I don't want the power to mess up anything else, thank you very much. I don't want sudo. My platform lead really doesn't want me to have sudo.

Traffic Claim Enforcer is an admission webhook that intercepts any attempt to configure one of those resources with global implications and, before allowing it, checks:

  1. Does the resource have a narrow scope that affects only local things?
  2. Is there a TrafficClaim that grants the resource the broader scope requested?

A TrafficClaim is a new Kubernetes custom resource we defined that exists solely to narrow and define the scope of resources in a namespace.  Here are some examples:

kind: TrafficClaim
apiVersion: networking.aspenmesh.io/v1alpha3
metadata:
  name: allow-public
  namespace: cluster-public
claims:
# Anything on www.example.com
- hosts: [ "www.example.com" ]

# Only specific paths on foo.com, bar.com
- hosts: [ "foo.com", "bar.com" ]
  ports: [ 80, 443, 8080 ]
  http:
    paths:
      exact: [ "/admin/login" ]
      prefix: [ "/products" ]

# An external service controlled by ServiceEntries
- hosts: [ "my.external.com" ]
  ports: [ 80, 443, 8080, 8443 ]

TrafficClaims are controlled by Kubernetes Role-Based Access Control (RBAC).  Generally, the same roles or people that create namespaces and set up projects would also create TrafficClaims for those namespaces that need power to define service mesh traffic policy outside of their namespace scope.  Rule 1 about local scope above can be explained as "every namespace has an implied TrafficClaim for namespace-local traffic policy", to avoid requiring a boilerplate TrafficClaim.
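A sketch of how that RBAC might look (names are illustrative, and the resource plural is assumed to be trafficclaims): grant the platform team full control over TrafficClaims via a ClusterRole, and simply never give application roles any verbs on them.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: trafficclaim-admin       # illustrative name
rules:
- apiGroups: ["networking.aspenmesh.io"]
  resources: ["trafficclaims"]   # assumed plural of the TrafficClaim resource
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

Bind this ClusterRole to the platform team with a ClusterRoleBinding (or with RoleBindings in specific namespaces); application roles that lack these verbs cannot grant themselves broader traffic scope.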

A pattern we use is to put global config into a namespace like "istio-public" - that's the only place that needs TrafficClaims for things like public DNS names.  Or you might have a couple of namespaces like "istio-public-prod" and "istio-public-dev" or similar. It’s up to you.

Traffic Claim Enforcer does not prevent you from thinking of invalid config, but it does help to limit scope. If I'm trying to understand what happens when traffic goes to my microservice, I no longer have to examine every DestinationRule in the system.  I only have to examine the ones in my namespace, and maybe some others that have special TrafficClaims (and hopefully keep that list small).

Traffic Claim Enforcer also provides an early failure for config problems.  Without it, it is easy to create conflicting DestinationRules even in separate namespaces. This is a problem that Istio Vet will tell you about but cannot fix - it doesn't know which one should have priority. If you define TrafficClaims, then Traffic Claim Enforcer can prevent the conflicting configuration from being created at all.
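For instance (a hypothetical sketch), two teams can each define a DestinationRule for the same host with contradictory TLS settings, and nothing flags the conflict at creation time:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-mtls
  namespace: team-a
spec:
  host: reviews.prod.svc.cluster.local   # hypothetical service
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-plaintext
  namespace: team-b
spec:
  host: reviews.prod.svc.cluster.local   # same host, conflicting policy
  trafficPolicy:
    tls:
      mode: DISABLE

With TrafficClaims defined, the enforcer can reject whichever of these would exceed its namespace's claimed scope.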

Hat tip to my colleague, Brian Marshall, who developed the initial public spec for TrafficClaims.  The Istio community is undertaking a great deal of work to scope/manage config aimed at improving system scalability.  We made Traffic Claim Enforcer with a focus on scoping to improve the human config experience as it was a need expressed by several of our users.  We're optimistic that the way Traffic Claim Enforcer helps with human concerns will complement the system scalability side of things.

If you want to give Traffic Claim Enforcer a spin, it's included as part of Aspen Mesh.  By default it doesn't enforce anything, so out-of-the-box it is compatible with Istio. You can turn it on globally or on a namespace-by-namespace basis.

Click play below to check it out in action!

https://youtu.be/47HzynDsD8w