Kubernetes Learning Week Series 12
Understanding Kubernetes APIs: Custom Resource Definition
This article explains the concept of Custom Resource Definitions (CRDs) in Kubernetes, which allow users to extend the Kubernetes API by creating their own custom resources. It covers the process of creating CRDs, including defining schemas and validation rules, as well as building custom controllers to manage these resources. The article also discusses containerizing custom controllers with Docker and setting up the necessary Role-Based Access Control (RBAC) permissions.
Key Points
CRDs enable you to create your own custom resources in Kubernetes.
When a CRD is created, the Kubernetes API server validates it and generates new RESTful API endpoints for the custom resource.
Custom resources are managed by custom controllers or operators, ensuring the desired state of the resources is maintained.
The article provides an example of creating a CRD named “pdfdoc,” including the YAML manifest and Go code for a custom controller.
The custom controller is containerized with Docker and deployed using a service account with the necessary RBAC permissions.
The article demonstrates how to interact with the custom resource, including creating and modifying instances.
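To make the endpoint generation concrete, here is a minimal sketch that builds a CRD manifest as a Python dict and derives the REST collection paths the API server would serve for it. The group `example.com`, the version `v1`, the plural `pdfdocs`, and the `spec.text` field are assumptions chosen for illustration; the manifest in the article may differ.

```python
# Hypothetical CRD for a "pdfdoc" resource; group, version, and schema are assumed.
def build_crd(group, version, kind, plural, singular):
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # A CRD's metadata.name must have the form <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": singular},
            "versions": [{
                "name": version,
                "served": True,
                "storage": True,
                # The OpenAPI v3 schema is what the API server validates instances against.
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        "properties": {"text": {"type": "string"}},
                    }},
                }},
            }],
        },
    }


def collection_paths(crd, namespace):
    """Return the REST collection path for every served version of the CRD."""
    spec = crd["spec"]
    plural = spec["names"]["plural"]
    paths = []
    for version in spec["versions"]:
        if not version["served"]:
            continue
        base = f"/apis/{spec['group']}/{version['name']}"
        if spec["scope"] == "Namespaced":
            paths.append(f"{base}/namespaces/{namespace}/{plural}")
        else:
            paths.append(f"{base}/{plural}")
    return paths


crd = build_crd("example.com", "v1", "PdfDoc", "pdfdocs", "pdfdoc")
print(collection_paths(crd, "default"))
```

A custom controller would list and watch objects under such a path and reconcile each one toward its declared state.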
Interview Questions
What are the key considerations when creating a CRD?
How do you set up the necessary RBAC permissions for a custom controller?
Can you explain the process of creating a custom controller in Go?
What are the benefits of using CRDs in Kubernetes?
How Requests and Limits Work in Kubernetes
https://thenewstack.io/how-kubernetes-requests-and-limits-really-work/
This article delves into how Kubernetes handles resource management, with a particular focus on the inner workings of requests and limits, aiming to uncover the technical details behind this fundamental concept.
Key Points
A user’s raw resource requests and limits are stored in the Pod specification.
Each node reports a static allocatable capacity in its node status; this value does not change with the number of Pods running on the node.
During the scheduling of new Pods, the kube-scheduler considers node capacity, running Pod requests, and pending Pod requests. Limits are ignored in the scheduling process.
Kubernetes calculates a node’s “fullness” based on the sum of Pod requests, not actual resource usage.
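The scheduling rule described in these points can be sketched as a simple fit check. This is a simplified model, not the kube-scheduler's actual code: it considers only CPU (in millicores) and memory (in MiB), and the node and Pod values are made up for illustration.

```python
def fits(allocatable, scheduled_requests, new_request):
    """Return True if the new Pod's requests fit into the node's allocatable capacity.

    Only requests are summed; limits and real usage never enter the calculation.
    """
    for resource, capacity in allocatable.items():
        already_requested = sum(r.get(resource, 0) for r in scheduled_requests)
        if already_requested + new_request.get(resource, 0) > capacity:
            return False
    return True


def fullness(allocatable, scheduled_requests):
    """Fraction of each allocatable resource claimed by Pod requests."""
    return {
        resource: sum(r.get(resource, 0) for r in scheduled_requests) / capacity
        for resource, capacity in allocatable.items()
    }


# Hypothetical node and Pods.
node = {"cpu_m": 4000, "memory_mi": 8192}
running = [
    {"cpu_m": 1500, "memory_mi": 2048},
    {"cpu_m": 2000, "memory_mi": 4096},
]

print(fits(node, running, {"cpu_m": 500, "memory_mi": 1024}))  # exactly fills the CPU
print(fits(node, running, {"cpu_m": 600, "memory_mi": 1024}))  # exceeds the CPU
print(fullness(node, running))
```

Because the check uses requests alone, a node can count as full while its CPUs sit mostly idle, and a node with spare requests can still be heavily loaded.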
Interview Questions
How are CPU requests and limits implemented at the Linux kernel level?
How are memory requests and limits implemented at the Linux kernel level?
How does Kubernetes’ eviction process work when a node is under resource pressure?
Pod Disruption In Kubernetes
This article discusses Pod Disruption Budgets (PDBs) in Kubernetes. A PDB is a Kubernetes resource that limits how many Pods of an application may be unavailable at the same time during voluntary disruptions (e.g., draining a node for maintenance). It cannot prevent involuntary disruptions (e.g., node failure), although Pods lost that way still count against the budget. By keeping a minimum number of Pods available to handle user requests, a PDB protects application stability and availability during planned operations.
Key Points
Function of PDB:
A PDB declares how much disruption an application can tolerate. Evictions requested through the Eviction API (for example, by kubectl drain) are refused when they would drop the number of available Pods below the minimum specified in the PDB.
Benefits of PDB:
Prevent unnecessary downtime.
Ensure smooth upgrades and maintenance.
Enhance cluster reliability.
Improve node management.
Seamless integration with the Cluster Autoscaler.
Main PDB Fields:
.spec.selector: Defines the Pods the PDB applies to.
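.spec.minAvailable: Specifies the minimum number of Pods that must be available, as an absolute number or a percentage.
.spec.maxUnavailable: Specifies the maximum number of Pods that can be unavailable, as an absolute number or a percentage. Only one of minAvailable and maxUnavailable can be set in a single PDB.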
Example:
The article demonstrates creating a PDB for a simple Mario demo application, setting minAvailable to 4 (out of 8 replicas), and testing the PDB by draining a node.
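The arithmetic behind this example can be sketched as follows. The function is a simplified model of how the number of allowed disruptions follows from the healthy Pod count and the budget; it handles integer values only (no percentages), and the function and parameter names are hypothetical.

```python
def disruptions_allowed(current_healthy, expected_pods,
                        min_available=None, max_unavailable=None):
    """Return how many more Pods may be evicted without violating the budget."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = expected_pods - max_unavailable
    # The budget never goes negative; zero means every eviction is refused.
    return max(0, current_healthy - desired_healthy)


# 8 replicas with minAvailable: 4, as in the article's example.
print(disruptions_allowed(current_healthy=8, expected_pods=8, min_available=4))
# After four Pods have been evicted and not yet replaced, nothing more is allowed.
print(disruptions_allowed(current_healthy=4, expected_pods=8, min_available=4))
```

With all 8 replicas healthy, a drain can evict up to 4 Pods; further evictions are refused until replacement Pods become ready elsewhere.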
Key Considerations When Using PDB:
Monitor the status of Pod disruptions.
Avoid using a PDB for single-replica Deployments, where requiring one available Pod would block every node drain.
Set appropriate minAvailable and maxUnavailable values to balance application availability and flexibility.
Interview Questions
How does a PDB help maintain application availability during node maintenance or failure?
What are the main fields of a PDB, and how do they function?
What are the steps outlined in the article for creating and testing a PDB for the Mario demo application?
What important factors should be considered when using PDBs?
How to Resolve Network Latency Jitter Issues Caused by a Large Number of IPVS Rules
This article discusses a network latency issue that users encountered after migrating their application from virtual machines to the Kubernetes platform.
Key Points
The application experienced significantly higher error rates in containers than on virtual machines, primarily due to network timeout errors.
By using eBPF to trace kernel network functions, the author identified significant delays when packets were sent from the container’s veth interface to the host’s veth interface.
Further investigation using the perf tool revealed high soft interrupt activity on a specific CPU, caused by TIMER soft interrupts, with the estimation_timer() function consuming a considerable amount of time.
The root cause was the presence of a large number of IPVS rules on the node. Each time the estimation_timer() function traversed these rules, it consumed significant time, leading to network timeouts.
Questions of Interest
How did the author use eBPF to trace kernel network functions and narrow down the root cause of the issue?
What long-term solutions were proposed to address the problem caused by a large number of IPVS rules?
What temporary fix did the author implement in the production environment to work around the issue?
Building a Service Mesh: Admission Controller
https://dev.to/ramonberrutti/build-your-service-mesh-part-2-9c4
This article explains how to create an Admission Controller in Kubernetes for a service mesh. The Admission Controller modifies Pods by injecting init containers and sidecar containers before the Pods are persisted to etcd. The article includes the code for the Admission Controller, deployment steps, and instructions on how to test it by adding the appropriate annotation to the Pod template.
Key Points
The Admission Controller is used to validate and modify objects before they are persisted to etcd.
The mutate function of the Admission Controller processes AdmissionReview objects and returns an AdmissionResponse object with modification patches.
The Admission Controller is registered through a MutatingWebhookConfiguration, which tells the kube-apiserver to send Pod creation requests to the injector.
To trigger the Admission Controller, the annotation diy-service-mesh/inject: "true" must be added to the Pod template.
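The mutate flow described in these points can be sketched in a few lines. This is not the article's code: the container names and images are hypothetical, and only the annotation key comes from the article. The response shape (`uid`, `allowed`, `patchType`, and a base64-encoded JSON Patch) follows the `admission.k8s.io/v1` AdmissionReview format.

```python
import base64
import json

INJECT_ANNOTATION = "diy-service-mesh/inject"

# Hypothetical containers to inject; names and images are placeholders.
INIT_CONTAINER = {"name": "proxy-init", "image": "example/proxy-init:latest"}
SIDECAR = {"name": "proxy", "image": "example/proxy:latest"}


def mutate(review):
    """Build an AdmissionReview response, adding a patch only for annotated Pods."""
    request = review["request"]
    pod = request["object"]
    response = {"uid": request["uid"], "allowed": True}

    annotations = pod.get("metadata", {}).get("annotations") or {}
    if annotations.get(INJECT_ANNOTATION) == "true":
        patch = []
        if "initContainers" in pod["spec"]:
            # Append to the existing list.
            patch.append({"op": "add", "path": "/spec/initContainers/-",
                          "value": INIT_CONTAINER})
        else:
            # The list does not exist yet, so create it.
            patch.append({"op": "add", "path": "/spec/initContainers",
                          "value": [INIT_CONTAINER]})
        patch.append({"op": "add", "path": "/spec/containers/-", "value": SIDECAR})
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()

    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview",
            "response": response}


sample_review = {"request": {"uid": "abc", "object": {
    "metadata": {"annotations": {INJECT_ANNOTATION: "true"}},
    "spec": {"containers": [{"name": "app", "image": "example/app:1.0"}]},
}}}
print(json.dumps(mutate(sample_review)["response"], indent=2))
```

Pods without the annotation are allowed unchanged, so the webhook stays out of the way of workloads that have not opted in.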
Related Interview Questions
What is the role of an Admission Controller in a service mesh?
How does the Admission Controller modify Pods?
How is an Admission Controller deployed?
How can you test an Admission Controller?
How I Reduced EKS Windows Node Startup Time from 5 Minutes to Around 90 Seconds
https://hackernoon.com/how-i-reduced-eks-windows-node-start-time-from-5-min-to-90s
This article discusses how the author reduced the startup time of AWS EKS (Elastic Kubernetes Service) Windows nodes from 5 minutes to around 90 seconds using various optimization techniques.
Key Points
Preloading Base Images:
The author used AWS Image Builder to cache the time-consuming IIS downloads in the node image, which significantly reduced node startup time.
Enabling Fast Launch:
Enabling the Fast Launch feature on EC2 Windows instances reduced the initial startup time to about 50-60 seconds.
Optimizing Startup Scripts:
The longest part of the node startup process, the Start-EKSBootstrap.ps1 script, was optimized by removing unnecessary dependencies and rewriting it in C# for better performance.
Final Results:
Optimized settings allowed an m5a.xlarge Windows node to start in approximately 93 seconds.
An m5a.2xlarge node started in around 77 seconds.
Further Improvements:
Additional performance gains could be achieved by reducing the time spent in the OOBE (Out-Of-Box Experience) process and optimizing the EKS-StartupTask.ps1 script.