Using Istio routing rules to reduce data sharing risks

The GOV.UK Verify Proxy Node is a component of Government Digital Service (GDS) integration with the eIDAS framework. The Proxy Node is deployed on the GDS Supported Platform (GSP), a Kubernetes-based platform that provides standard systems and tooling for building, deploying and monitoring GDS applications.

Choosing an open source tool 

We knew regulatory requirements and the sensitive nature of the data shared between GDS and the eIDAS framework meant that security was crucial. We had several sessions with the GDS Information Assurance team, the project team and colleagues from the National Cyber Security Centre. 

We decided to use open source Istio to mitigate some of the risks. We chose Istio because it provides:

- layer 7 routing rules that work with the layer 4 network policy resources native to Kubernetes
- mutual TLS, both internal and external
- HTTP request end-to-end tracing across multiple microservices
- egress traffic control

There are also several other aspects of Istio that made it an attractive choice for the team, including:

- traffic shaping, to support canary-style deployments and A/B testing
- service to service authorisation rules

Securing the Proxy Node

The GSP is based on AWS Elastic Kubernetes Service (EKS). To provide security for the Proxy Node and its data we used an AWS CloudHSM, with strict controls over which components were able to connect to this external resource. There are 2 components, residing in different namespaces, that connect to the CloudHSM over a TCP connection on ports 2223-5.

We were using EKS version 1.12 and Istio version 1.1.4.

We installed Istio using a Helm chart. Helm is a tool that helps manage Kubernetes applications.

These are the relevant parts of the values.yaml:

global:
  k8sIngressSelector: ingressgateway
  mtls:
    enabled: true
  proxy:
    accessLogFile: "/dev/stdout"
istio_cni:
  enabled: true
gateways:
  istio-egressgateway:
    enabled: true
    ports:
    - port: 80
      name: http2
    - port: 443
      name: https
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
    - port: 2223
      name: tcp-cloudhsm-2223
sidecarInjectorWebhook:
  enableNamespacesByDefault: true
  rewriteAppHTTPProbe: true

We followed the example given in Istio’s documentation to remove arbitrary egress directly, forcing all egress traffic (out of the cluster) to route via something in istio-system:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-istio-system-and-kube-dns
  namespace: proxy-node
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kube-system: "true"
    ports:
    - protocol: UDP
      port: 53
  - to:
    - namespaceSelector:
        matchLabels:
          istio: system
  - to:
    - namespaceSelector:
        matchLabels:
          namespace: proxy-node

If you’re doing something similar, you’ll need to add the kube-system and istio labels referenced here as they are not present in a default install. 
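As a sketch of what that means in practice, the namespaces can be labelled with manifests like the following (the label keys and values match the NetworkPolicy selectors above; you could equally apply them with kubectl label):

```yaml
# Sketch: adding the labels that the NetworkPolicy namespaceSelectors
# match on. These labels are not present in a default install.
apiVersion: v1
kind: Namespace
metadata:
  name: kube-system
  labels:
    kube-system: "true"
---
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio: system
---
apiVersion: v1
kind: Namespace
metadata:
  name: proxy-node
  labels:
    namespace: proxy-node
```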

We also added a third “to” block, which was not included in the Istio documentation example, to allow all egress within the namespace. Most people find it easier to reason about traffic rules on an ingress basis, so this block supports that by re-enabling all local egress.

At this point of the project, we’d: 

- disabled arbitrary egress from all but the istio-system and kube-system namespaces - we plan on limiting arbitrary egress from these namespaces in the future
- enabled mutual TLS globally within the Istio service mesh
- added an injected Istio sidecar to every pod in the affected namespaces

The next task was to use Istio’s traffic management resources to allow the 2 components to communicate with the CloudHSM. We used a blog post on using external services to help with this.

However, that and most of the examples in Istio documentation only covered a single endpoint from a single application in a single namespace. Our need spanned several namespaces. 

The piece missing from the existing documentation was which of the resources shown in the examples needed to be placed in which namespaces.

As the CloudHSM is outside the service mesh, we needed to add a ServiceEntry. Some of the configuration goes into a “common” namespace, which in this case is istio-system, while other parts are applied to the application-specific namespaces.

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: cloudhsm-2223
  namespace: istio-system
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  addresses:
  -  # CloudHSM IP address (not shown)
  ports:
  - name: tcp-2223
    number: 2223
    protocol: TCP
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address:  # CloudHSM IP address (not shown)

The hosts entry is not a fully qualified domain name in the DNS sense, as the connection to the CloudHSM is made by IPv4 address. However, we used it to tie together the various resources for this particular route.

You’ll see the “-2223” suffix in the code examples. This is because CloudHSM connectivity spans ports 2223 to 2225, so each of these resources needs repeating for ports 2224 and 2225. The code examples here only show 2223.
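As a sketch of that repetition, the port 2224 counterpart of the ServiceEntry above differs only in its name, host suffix and port number (2225 follows the same pattern):

```yaml
# Sketch: the port 2224 variant of the cloudhsm ServiceEntry.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: cloudhsm-2224
  namespace: istio-system
spec:
  hosts:
  - cloudhsm-2224.tcp.svc
  addresses:
  -  # CloudHSM IP address, as for 2223
  ports:
  - name: tcp-2224
    number: 2224
    protocol: TCP
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address:  # CloudHSM IP address, as for 2223
```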

We needed to redirect the traffic leaving the pod to the istio-egressgateway in the istio-system namespace. That required a VirtualService. We needed to add this to each namespace that wants to make use of this connection.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: proxy-node-cloudhsm-2223-egress
  namespace: proxy-node
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  gateways:
  - mesh
  tcp:
  - match:
    - gateways:
      - mesh
      destinationSubnets:
      -  # CloudHSM address (not shown)
      port: 2223
      sourceLabels:
        talksToHsm: "true"
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: proxy-node-cloudhsm-2223-egress
        port:
          number: 443
  exportTo:
  - "."

The use of sourceLabels limits which pods are allowed to use this route. Similarly, exportTo limits the visibility of the route definition to the current namespace, preventing overlap with similar routes defined in other namespaces.
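To opt a workload into the route, its pods must carry the label that sourceLabels matches on. A minimal sketch, using a hypothetical Deployment name and image:

```yaml
# Sketch: a hypothetical Deployment whose pods carry the label
# matched by the VirtualService's sourceLabels.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-hsm-client  # hypothetical name
  namespace: proxy-node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-hsm-client
  template:
    metadata:
      labels:
        app: example-hsm-client
        talksToHsm: "true"  # opts these pods into the CloudHSM egress route
    spec:
      containers:
      - name: app
        image: example/hsm-client:latest  # hypothetical image
```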

We redirected the port from 2223 to 443 because the connection with the istio-egressgateway is secured with mutual TLS. Traffic from this pod is tagged as being part of an Istio subset, the behaviour of which is governed by a DestinationRule that handles connectivity with the istio-egressgateway.

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: proxy-node-egressgateway-for-cloudhsm-2223
  namespace: proxy-node
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: proxy-node-cloudhsm-2223-egress
    trafficPolicy:
      portLevelSettings:
      - port:
          number: 443
        tls:
          mode: ISTIO_MUTUAL
          sni: cloudhsm-2223.tcp.svc
  exportTo:
  - "."

The istio-egressgateway needs to be configured to listen for the incoming connection. We did this with a Gateway resource.

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: cloudhsm-egress-2223
  namespace: istio-system
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: tls-cloudhsm-2223
      protocol: TLS
    hosts:
    - cloudhsm-2223.tcp.svc
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem

The locations of the certificates and keys are important. These are provided by Istio (specifically, by Citadel) and mounted automatically.

At this point, the istio-egressgateway needed a route definition for what to do with the CloudHSM-bound traffic that reaches it. We used a VirtualService for this.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: cloudhsm-egress-2223
  namespace: istio-system
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  gateways:
  - cloudhsm-egress-2223
  tcp:
  - match:
    - gateways:
      - cloudhsm-egress-2223
      port: 443
    route:
    - destination:
        host: cloudhsm-2223.tcp.svc
        port:
          number: 2223
      weight: 100

The final piece in the puzzle is the DestinationRule to handle the final segment.

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: proxy-node-cloudhsm-2223-egress
  namespace: proxy-node
spec:
  host: cloudhsm-2223.tcp.svc
  exportTo:
  - "."

We did not need this DestinationRule when routing an HTTP, HTTPS or TLS service via the istio-egressgateway. It only appears to be necessary when routing TCP services. It’s also not clear why this resource cannot live in the common namespace.

At this point, a pod in the proxy-node namespace that carries the label talksToHsm: "true" will be able to establish a TCP connection, via mutual TLS with the istio-egressgateway, to the CloudHSM.

You can also use the examples above to cover traffic over HTTP and HTTPS (or via TLS using Server Name Indication). At this point, you can switch back to the Istio documentation for the rest of the instructions. What we’ve shown here is how and where to split the various resources across the namespaces to ensure connectivity.

If you’re doing something similar and have found our experience helpful, we’d like to hear from you. You can get in touch by leaving a comment below.

Posted at 16:30, 31 July in Technology in government.