Google Kubernetes Engine (GKE)

GKE is Google’s managed Kubernetes service. There are many container orchestration engines, but Kubernetes (K8s) is the most popular.

GKE sits between IaaS and PaaS
GKE manages and maintains logging, health management, and monitoring of clusters
Kubernetes clusters are easy to upgrade, and you can choose a release channel based on how frequently you want updates
Create regional GKE clusters to improve the availability and resilience of your apps. This will distribute GKE control plane components, nodes, and pods across zones in a region.

Container Review

Popularized by Docker initially → write once, run (almost) anywhere
Single package of everything needed to run an app including dependencies (other than external dependencies like a database)
Supports consistency across dev/test/prod environments
Loose coupling between application and OS layers
Simpler to migrate between on-prem and cloud (including other clouds)
Supports agile development and operations
Great for microservices
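Artifact Registry (the replacement for Container Registry) stores Docker images in private repositories. Container Registry used gcr.io/project-name paths; Artifact Registry repositories use LOCATION-docker.pkg.dev/project-name/repo-name. It supports both container images and non-container artifacts.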

General Kubernetes (K8s) Notes

Helpful Supplemental Materials: Best Practices, Cluster Autoscaler, Networking Overview

Containers run inside of Pods. A Pod is a K8s object and the smallest deployable unit in GKE; a container always runs inside a Pod. Many Pods contain only a single container, but some also contain sidecars such as a proxy.
- A pod is a logical application-centric unit for hosting containers
- The containers inside of a pod should be tightly-coupled
You should create objects declaratively using specification files written in YAML
- Declarative configuration states the desired end result, as opposed to imperative models, such as step-by-step Ansible or Chef runs, where you describe each step

A Deployment is a declarative, desired state for Pods and/or ReplicaSets. This is the preferred object for deploying compute workloads in K8s.
- Logic for updating, rolling back, and scaling deployments
- Proportional scaling and error checking for rollouts
- Deployments allow you to perform rolling updates, creating a ReplicaSet of Pods with the new container image. Old Pods are not removed until a sufficient number of new Pods are Running, and new Pods are not created until a sufficient number of old Pods have been removed.
A ReplicaSet maintains a stable set of replica Pods running at any given time. It is used to guarantee the availability of a specified number of identical Pods
StatefulSets are used to manage stateful applications by providing guarantees about the ordering and uniqueness of the pods. Unlike a Deployment, a StatefulSet maintains a persistent identity for each pod.
- Stable network identity and persistent storage, along with ordered graceful deployment, scaling, and updates of pods
  - Persistent disks in StatefulSets are retained even if a Stateful Pod is removed and must be manually deleted
- When Pods in a StatefulSet are being deleted, they are terminated and removed in reverse order
- Used for applications like Elasticsearch or any other application that holds state
DaemonSets ensure that all (or some) nodes run a copy of a pod. Examples include a cluster storage tool, log collection tool, custom drivers for a GPU, or monitoring tool where a workload requires access to a service on every node
- One Pod per Node mode across the cluster or a subset of nodes
- If new nodes are added, new daemon pods are automatically created on them
A Kubernetes CronJob creates Jobs on a repeating schedule
- A Job creates one or more Pods to complete a task; the Pods terminate when the Job completes successfully
- A parallel Job may have a fixed completion count and run multiple Pods to completion; the degree of parallelism is configurable
- Init Containers are not related to Jobs or CronJobs. They are listed in the initContainers array of a Pod spec (separate from the containers array) and run to completion before the application containers in the Pod start.
- Jobs and CronJobs are not to be confused with Cloud Scheduler
Tainting a node with the NoSchedule effect (for example key=value:NoSchedule) tells the scheduler not to place Pods on it unless they have a matching toleration
kubectl cordon prevents further Pods from being scheduled on a node, and kubectl drain safely evicts the Pods already running there. Both are useful for maintenance and upgrades
Operators allow you to add automated logic to deployments and extend the Kubernetes API functionality with custom resource definitions (CRDs).
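
To make the declarative idea concrete, here is a minimal sketch that builds a Deployment manifest as a Python dictionary and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The name web, the image nginx:1.25, and the replica count are hypothetical values chosen for illustration.

```python
import json


def deployment_manifest(name, image, replicas):
    """Return a Deployment manifest that describes the desired state."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical workload: three replicas of an nginx container.
manifest = deployment_manifest("web", "nginx:1.25", 3)
print(json.dumps(manifest, indent=2))
```

The manifest only says what should exist (three identical Pods); working out how to get there is left to the Deployment controller.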

Automatically Scaling Deployments

Horizontally scale Pods with HorizontalPodAutoscaler
- Auto-scales the number of Pod replicas in a ReplicaSet/Deployment
- CPU and Memory thresholds are observed by GKE to trigger scaling
- Custom, multiple, and external metrics (from Cloud Monitoring) can be used
Vertically scale Pods with VerticalPodAutoscaler
- Newer feature that recommends or applies CPU and memory requests; better suited to stateful deployments where horizontal scaling isn’t ideal for the workload
- Should not be used alongside the HorizontalPodAutoscaler on the same CPU or memory metrics
Combine horizontal and vertical with multidimensional Pod autoscaling (currently in beta as of April 2023)
Scale cluster nodes with Node-pool cluster autoscaling
- The Cluster Autoscaler adds new VMs when we need more pods but don’t have the capacity. Underutilized nodes are torn down. You specify a minimum and maximum number of nodes per node pool.
  - gcloud container …. --enable-autoscaling --min-nodes 1 --max-nodes 5
- Works best with Node Pools, can have autoscaling policies per node pool
  - Node Pools are a group of nodes within a cluster that all have the same configuration
  - Node pools should be designed around the specific requirements of workloads, then enable cluster and horizontal pod autoscaling when appropriate
  - Supports preemptible VMs in clusters and node pools
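
For the HorizontalPodAutoscaler described above, the core calculation is a simple ratio: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that ratio and clamps the result to a minimum and maximum. The function name and the sample numbers are assumptions for illustration, and the real controller also applies a tolerance and stabilization behavior that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Apply the HPA ratio formula, then clamp to the configured bounds."""
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 90% average CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # 6
# 4 replicas at 30% average CPU with a 60% target: scale in.
print(desired_replicas(4, 30, 60, min_replicas=2, max_replicas=10))  # 2
```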

You can plan the consumption of your resources with CPU and memory requests. Some interesting Kubernetes CPU math: 1 CPU = 1000 millicores, so 100m = 1/10 of a CPU. You can also set limits: a container that exceeds its memory limit is terminated, while a container that reaches its CPU limit is throttled.
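
The millicore math above can be sketched in a few lines. The helper names are hypothetical, and the capacity estimate deliberately ignores the CPU that a real node reserves for system components, so treat it as an upper bound.

```python
def cpu_to_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as "250m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def pods_that_fit(node_cpu, pod_request):
    """Upper bound on Pods per node, by CPU request only."""
    return cpu_to_millicores(node_cpu) // cpu_to_millicores(pod_request)


print(cpu_to_millicores("100m"))   # 100, i.e. 1/10 of a CPU
print(cpu_to_millicores("1"))      # 1000
print(pods_that_fit("2", "250m"))  # 8
```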

Services Overview

A Service is another Kubernetes object; it exposes a set of Pods to the network and gives your Pod replicas a stable IP address. There are three main types:
- ClusterIP: An internal IP address for your pods, kind of like an internal load balancer
- NodePort: Exposes the service on each Node’s IP at a static port
- LoadBalancer: K8s cloud controller manager creates a GCP Network Load Balancer
Configured with selectors that match labels, which are key-value pairs in object metadata
Selectors search for a group of labels, like “app=nginx”
Any Pod whose labels match the selector becomes part of that Service
Each Service also gets a built-in DNS name
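
Selector matching is easy to picture as a dictionary comparison. The sketch below is an illustration only: the Pod names and labels are made up, and the DNS helper assumes the default cluster domain cluster.local.

```python
def matches(selector, labels):
    """A Pod matches when every selector key/value appears in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def service_dns(service, namespace, cluster_domain="cluster.local"):
    """In-cluster DNS name for a Service (default cluster domain assumed)."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# Hypothetical Pods and their labels.
pods = {
    "web-1": {"app": "nginx", "tier": "frontend"},
    "web-2": {"app": "nginx", "tier": "frontend"},
    "db-1": {"app": "postgres", "tier": "backend"},
}

selector = {"app": "nginx"}
backends = sorted(name for name, labels in pods.items() if matches(selector, labels))
print(backends)                       # ['web-1', 'web-2']
print(service_dns("web", "default"))  # web.default.svc.cluster.local
```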

Health Checks Overview

Liveness Probes are checks performed by the kubelet. A probe can check an HTTP endpoint, check a TCP socket, or run a command; a container that fails its liveness probe is restarted
- Include it in a pod spec YAML
Readiness Probes are similar, but define when a pod is ready to start serving traffic. Traffic won’t be directed until it’s ready. Includes an initial delay.
- Also included in a pod spec YAML
Probes are performed by a handler:
- ExecAction
- TCPSocketAction
- HTTPGetAction – Any response code ≥ 200 and < 400 indicates success. Any other code indicates failure.
Probe timing and thresholds are tuned with these fields:
- initialDelaySeconds, periodSeconds, timeoutSeconds
- successThreshold, failureThreshold
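
The success rule for HTTP probes and the role of failureThreshold can be sketched as follows. This is a simplified model, not the kubelet's actual code: the function names are hypothetical, and the only default assumed is failureThreshold = 3.

```python
def http_probe_succeeded(status_code):
    """HTTPGetAction succeeds for any status code >= 200 and < 400."""
    return 200 <= status_code < 400


def probe_gave_up(results, failure_threshold=3):
    """True once `failure_threshold` consecutive probe results are failures."""
    consecutive_failures = 0
    for succeeded in results:
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return True
    return False


print(http_probe_succeeded(302))  # True
print(http_probe_succeeded(503))  # False

# Two failures followed by a success reset the counter.
print(probe_gave_up([True, False, False, True, False]))  # False
# Three failures in a row reach the threshold.
print(probe_gave_up([True, False, False, False]))        # True
```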

Accessing External Services

Service Endpoints are Services with no selector that map to an external IP address or FQDN
- Create a ClusterIP Service with corresponding Endpoint object or
- Create an ExternalName with FQDN
- Endpoints can point to multiple IP addresses
Sidecars can provide a connection to the external service, essentially a proxy

Volumes & Persistent Storage Overview

Reminder: Container storage is ephemeral and goes away when a container dies, kind of like a local SSD on a VM
A PersistentVolume is like a persistent disk on a VM, it is a K8s object that defines a piece of storage in the cluster, configured with a Storage Class, and can be manually or dynamically provisioned
- Access Modes define how a volume may be mounted by nodes
  - ReadWriteOnce – A single node can mount the volume and read/write
  - ReadOnlyMany – Many nodes can mount the volume, but read-only
  - ReadWriteMany – Not supported by GCP persistent disks!
- If you create a PersistentVolumeClaim with a resource request, GKE will dynamically create a disk and volume using the Storage Class provisioner (which will be a persistent disk, not a GCS bucket)
- You need to consume the volume claim when defining volumes in a Pod spec
Volumes are declared in a Pod spec and appear as directories mounted in a container to access files. Here are the most common types:
- emptyDir: scratch space that can be shared by multiple containers in the same pod. Deleted forever when a Pod is removed from a Node.
- gcePersistentDisk: A volume type native to GCP, must be created beforehand and can be pre-populated with data and mounted read-only by multiple consumers. Will be unmounted when a Pod is removed.
- PersistentVolumeClaim: Used to mount a PersistentVolume into a Pod. It is a way to “claim” durable storage and binds to a matching PersistentVolume (pre-created or dynamically provisioned)
  - You’ll probably see this most frequently
The StorageClass can be customized: the GKE default uses standard persistent disks, but you can define a class that uses SSD persistent disks or sets up regional availability
Constraints
- All replicas in a Deployment share the same PersistentVolumeClaim, so it must be in ReadOnlyMany if more than one Pod needs access
Before using volumes (and persistent volumes) ask yourself, “Does your data really need to be stored in a disk? Or can you abstract this to GCS and Databases?” For example, don’t mount a disk just to serve images on a website.

ConfigMaps & Secrets Overview

Secrets are objects designed to hold sensitive data and insert it at runtime. They can be consumed as environment variables or volumes. Their values are base64-encoded, not encrypted. Cloud KMS can encrypt Secrets at the application layer for an added layer of actual security
ConfigMap objects decouple app configuration from image content. They are created from files, directories, or literal values, and can be referenced as environment variables or mounted as a Volume.
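
To see why "encoded, not encrypted" matters, this small sketch round-trips a made-up value through base64, the encoding used for the data field of a Secret manifest. No key is involved, so anyone who can read the manifest can recover the value.

```python
import base64

# Hypothetical secret value, for illustration only.
plaintext = "s3cr3t-password"

# This is the form the value takes in a Secret manifest's data field.
encoded = base64.b64encode(plaintext.encode("utf-8")).decode("ascii")

# Decoding needs no key at all.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)  # s3cr3t-password
```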

Kubernetes Deployment Patterns

Rolling Updates

Rolling updates are the default update strategy
Gradually replace Pods with an updated spec
Control how many additional Pods may be created (maxSurge)
Specify a threshold for unavailable Pods (maxUnavailable) that bounds how far a failing rollout can proceed
Use kubectl set image for rolling updates
Use kubectl rollout undo to roll back a deployment
The RollingUpdate strategy allows you to confidently roll out a new version of an application. Defining a threshold for the maximum number of unavailable Pods will stall a rollout if new Pods do not become ready within a certain time, potentially catching any issues with the application update. A surge policy will allow slightly more Pods to be running than normal, so that the new rollout can be attempted without removing all of the existing Pods.
If you use the Recreate strategy (strategy type Recreate), all existing Pods are killed before the new ones are created. This is good if the Pod needs to write to a persistent disk using ReadWriteOnce, but there will be some downtime between Pod versions
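
The surge and unavailability thresholds translate into concrete Pod counts. The sketch below assumes the Kubernetes defaults of 25% for both settings and the documented rounding (maxSurge rounds up, maxUnavailable rounds down). The helper names are hypothetical, and the edge case where both values resolve to zero is not handled.

```python
import math


def resolve(value, replicas, round_up):
    """Resolve an absolute count or a percentage string against the replica count."""
    if isinstance(value, str) and value.endswith("%"):
        fraction = int(value[:-1]) * replicas / 100
        return math.ceil(fraction) if round_up else math.floor(fraction)
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Most Pods that may exist, and fewest that must stay available, mid-rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(10))  # {'max_total_pods': 13, 'min_available_pods': 8}
print(rollout_bounds(3))   # {'max_total_pods': 4, 'min_available_pods': 3}
```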

Canary Deployments

Combines multiple Deployments with a single Service
Deploy updates to a small subset of traffic (distributing fewer replicas for a canary)
Common way to test on production traffic
Can be automated with Spinnaker
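
When the stable and canary Deployments sit behind one Service, the canary's share of traffic is roughly its share of the replicas. The sketch below is a back-of-the-envelope estimate that assumes requests are spread evenly across Pods; the replica counts are hypothetical.

```python
def canary_share(stable_replicas, canary_replicas):
    """Approximate fraction of requests that reach the canary Pods."""
    return canary_replicas / (stable_replicas + canary_replicas)


# 9 stable Pods and 1 canary Pod behind the same Service.
print(f"{canary_share(9, 1):.0%}")  # 10%
```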

Blue-Green (or Red-Black)

Maintain two versions of your application deployment
Switch traffic from blue to green with the Service selector
All traffic immediately sent to new deployment, just flip the selector back if necessary
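
The selector flip can be sketched as a one-field change on the Service. The Service name, labels, and ports below are hypothetical, and the manifest is modeled as a Python dictionary rather than applied to a cluster.

```python
import copy

# Hypothetical Service that currently points at the blue Deployment.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "selector": {"app": "web", "version": "blue"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}


def switch_to(svc, version):
    """Return a copy of the Service whose selector targets another version."""
    updated = copy.deepcopy(svc)
    updated["spec"]["selector"]["version"] = version
    return updated


green = switch_to(service, "green")
print(green["spec"]["selector"])  # {'app': 'web', 'version': 'green'}

# Rolling back is the same operation in reverse.
blue_again = switch_to(green, "blue")
```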

Helm: The Kubernetes Package Manager

Helm is a standalone tool that packages Kubernetes object manifests and configurations into a Helm chart. It saves a lot of time writing extensive YAML files
Maintains the lifecycle of a deployment to GKE
There are public repos of helm charts for popular software, mostly OSS
Other manifest-management solutions are available

Using Helm

Install the helm tool
Search for software in Artifact Hub (Kind of like Docker Hub)
Add the necessary Helm repository
- helm repo add bitnami https://charts.bitnami.com/bitnami
Install the Helm chart
- helm install my-wordpress bitnami/wordpress
Helm applies the templated manifests from the chart to your cluster. You can override the chart’s default values by passing a values file with -f values.yaml

Advanced Ingress Controls

Ingress is a more customizable way to expose traffic than a cloud load balancer
An Ingress object configures access to services from outside the cluster
Designed for HTTP and HTTPS services (web traffic)
Can provide SSL, name-based, and path-based routing, can also add rewrite rules
Once you have an Ingress defined, you need an Ingress Controller:
Ingress Controllers route traffic to services based on Ingress definitions
- Usually fronted by a Cloud Load Balancer
- Consolidates your routing through a single resource, you don’t want 10 load balancers for a website with 10 services, you want 1 with effective path-based routing
- NGINX is a very common ingress controller

Multi-cluster ingress can be used to serve an application deployed across multiple GKE clusters. kubemci creates a Global Cloud Load Balancer and can direct traffic to the region with the lowest latency. (kubemci is a separate tool that must be installed before use.)

This can also be used to route to multiple clusters using Anthos, see Part VIII for Anthos notes.

Running a Secure GKE Cluster

Remember high-level cloud security principles including VPC and IAM configuration and other GCP resources. Make sure you’re using trusted container images, secure code, and that you’re running the latest version of images with the latest updates.

Private Clusters are a type of VPC-native cluster that only uses internal IP addresses. Nodes, Pods, and Services in a private cluster require unique subnet IP address ranges.

Binary Authorization is a service on Google Cloud that provides centralized software supply-chain security for applications that run on Google Kubernetes Engine (GKE) and Anthos clusters on VMware. It ensures that only signed and authorized images are deployed in your environments. It supports signature-based verification and also supports allowlisting images by name pattern, which can match a repo, a path, or a set of images.

Role-Based Access Control (RBAC)

RBAC is a granular method of regulating access to cluster resources that can be applied at a namespace or cluster level. It grants a set of actions (verbs) on specific API groups and resources, which helps with applying least-privilege principles. It can be used with Pod service accounts. There are Roles and ClusterRoles along with RoleBindings and ClusterRoleBindings.

Namespaces & Resource Restrictions

Namespaces are virtual clusters used to isolate resources for multiple teams or projects. Cluster resources can be divided among them with resource quotas. The default and kube-system namespaces are set up automatically on new clusters. Namespaces are a scope for resource names, so object names need to be unique within a namespace but not across the cluster (that’s why the namespace is in DNS names)

Pod Security Policies

A cluster-level resource that controlled security-sensitive aspects of a Pod spec. Pod Security Policies are deprecated (and were removed in Kubernetes 1.25), so you probably don’t need to study them. They were confusing for K8s devs so they’d be confusing for anyone trying to pass the PCA exam.

Network Policies

These are like firewalls for Pods. Network Policies are objects that define ingress and egress rules for Pods using selectors (and port numbers/protocols) and restrict their incoming and outgoing traffic. They can be used to isolate traffic between namespaces and can get very granular. Network policy enforcement has to be enabled on the cluster, ideally when the cluster is built.
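
As an illustration of the selector-and-port rules described above, this sketch models a NetworkPolicy that lets Pods labeled app=frontend reach Pods labeled app=backend on TCP port 8080. The policy name, labels, namespace, and port are all hypothetical; the manifest is built as a Python dictionary and printed as JSON.

```python
import json

# Hypothetical policy: only frontend Pods may reach backend Pods on TCP 8080.
policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-frontend-to-backend", "namespace": "default"},
    "spec": {
        # The Pods this policy applies to.
        "podSelector": {"matchLabels": {"app": "backend"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                # The Pods allowed to connect.
                "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                "ports": [{"protocol": "TCP", "port": 8080}],
            }
        ],
    },
}

print(json.dumps(policy, indent=2))
```

Once a policy selects a Pod for ingress, traffic that no rule allows is dropped, which is what makes the firewall comparison apt.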

Workload Identities

Workload Identity is the recommended way to access GCP services from applications running within GKE because it is more secure and easier to manage. Remember, by default, VMs in GCP use service accounts. Workload Identity allows you to map custom GCP service accounts to specific workloads in GKE. Kubernetes-native service accounts are mapped to GCP service accounts through an IAM policy binding plus an annotation on the Kubernetes service account; giving both accounts the same name is a convenient convention, not a requirement.
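
The mapping relies on two strings: the IAM member that represents the Kubernetes service account, and the annotation that points it at the GCP service account. The sketch below builds both to the best of my understanding of the documented format. The project, namespace, and account names are hypothetical, and the helper functions are not part of any Google library.

```python
def workload_identity_member(project_id, namespace, ksa_name):
    """IAM member string for a Kubernetes service account (KSA)."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def ksa_annotation(gsa_name, project_id):
    """Annotation placed on the KSA to name the GCP service account (GSA)."""
    return {
        "iam.gke.io/gcp-service-account":
            f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    }


# Hypothetical project, namespace, and account names.
print(workload_identity_member("my-project", "prod", "orders-ksa"))
print(ksa_annotation("orders-gsa", "my-project"))
```

The member string is what gets granted roles/iam.workloadIdentityUser on the GCP service account, and the annotation goes on the Kubernetes service account itself.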

Service Mesh Overview (Also important for Anthos… which is covered in the next post.)

Service meshes allow you to confidently operate microservices at scale. They allow you to manage traffic, maintain the reliability and visibility of services, along with dependency management

Data Plane – Sidecar container running a proxy (Envoy in Istio). It controls network traffic in and out of the Pod. The data plane communicates with a control plane to receive routing logic and send metrics
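Control Plane – In older Istio releases, there are 3 primary components (newer releases consolidate the control plane into a single istiod binary and no longer include Mixer). Istio can be enabled as part of a GKE installation or can be installed via Helm into the istio-system namespace.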
- Pilot: Configures the data plane, defines proxy rules and behavior
- Mixer: Collects traffic metrics and responds to authorization, access control, or quota checks
- Citadel: Assigns TLS certificates to each service and enables end-to-end encryption
Makes traffic management easier (good for blue/green deployments), secures traffic between Pods, collects telemetry, and visualizes services in topology graphs. Good for SRE teams who need to solve problems with distributed tracing
Improves security by including a managed private certificate authority (Mesh CA) for issuing mTLS certs. Review full security overview.

Traffic Director is GCP’s fully managed traffic control plane for service mesh. Uses Envoy Proxy under the hood. Works for TCP and HTTP traffic now. For details, see this video. It works at a more global level, offering Envoy proxies for Virtual Machines instead of just pods.

Google Cloud: Professional Cloud Architect (PCA) Exam Notes – Part VII

Sebastian Hooker
