Commit 5ee43a0

adding solution of challenge10

1 parent 38acf14
File tree: 2 files changed (+107, −0 lines)
Lines changed: 20 additions & 0 deletions

@@ -0,0 +1,20 @@
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-pod
+spec:
+  containers:
+  - name: my-container
+    image: busybox:1.28
+    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
+  initContainers:
+  - name: init-container
+    image: busybox:1.28
+    # command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
+    command: ["sh", "-c"]
+    args:
+    - |
+      echo "hello world"
+      sleep 5
+      echo "hello world"
+      exit 1
challenge10/readme.md

Lines changed: 87 additions & 0 deletions
## Challenge 10: Troubleshooting and Debugging the Pipeline 🐛

**Objective:** Demonstrate strong troubleshooting skills by diagnosing and solving common pipeline failures.

This challenge focuses on **Troubleshooting (Kubernetes/Docker/CI)** and **Interview Prep**.

### The Scenario

A developer pushes a change, and the new Deployment fails to update. Your job is to diagnose the common causes for each of these three failure scenarios:

1. **Kubernetes Deployment Failure:** The `web-app-deployment` is stuck in a state where the new Pods are created, but they remain in the **`Init: 0/1`** or **`ImagePullBackOff`** state.
2. **Liveness Probe Failure:** The Pods launch successfully and transition to a **`Running`** state, but the deployment rolls back, and you see repetitive restarts. Looking at the Pod events, you see the `livenessProbe` is consistently failing.
3. **CI Pipeline Failure (GitHub Actions):** The Docker build step in Challenge 8's workflow fails with the error: `denied: requested access to the resource is denied`.

**Your Deliverable:**

For each of the three failure scenarios, provide:

1. The **root cause** (1-2 sentences).
2. The **most critical command(s)** you would execute to confirm or diagnose the issue in a real-world environment.

## Solution
1. **Kubernetes Deployment Failure:**

**`Init: 0/1`**

If the Deployment creates new Pods but they remain stuck in the **`Init: 0/1`** state, the most likely cause is that an init container is waiting for an external service or resource that is not yet available.

Suppose we have the following Pod config:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-container
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
```
Here we have defined an app container and an init container. The init container waits until the `myservice` DNS name resolves so it can complete successfully, but since no such Service exists, it keeps waiting and the Pod stays in the `Init: 0/1` state.
Note: we may see other init-related statuses as well:

- `Init:ErrImagePull`: the init container's image could not be pulled
- `Init:Error`: the init container's script failed to execute successfully
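One way to unblock the Pod above is to create the Service the init container is waiting for. A minimal sketch, assuming `myservice` lives in the same namespace (the port numbers are illustrative, not from this repo):

```yaml
# Once this Service exists, its DNS name resolves, the nslookup loop in the
# init container succeeds, and the Pod moves past Init: 0/1.
apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  ports:
  - protocol: TCP
    port: 80        # port the Service exposes
    targetPort: 9376  # port on the backing Pods (illustrative)
```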
**`ImagePullBackOff`**

If the Pod is stuck in `ImagePullBackOff`, it is usually for one of the following two reasons:

- Wrong image name: the config specifies an incorrect image name or tag, so the kubelet is unable to pull the image.
- Permission issue: the cluster does not have the necessary credentials to pull the image, for example when we try to pull from a private repo without being logged in to Docker Hub.
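For the permission case, a common fix is to give the Pod a registry credential via `imagePullSecrets`. A hedged sketch (the Secret name `dockerhub-creds` and the image `myuser/private-app:1.0` are hypothetical):

```yaml
# Assumes a docker-registry Secret named dockerhub-creds already exists
# in this namespace (e.g. created with `kubectl create secret docker-registry`).
apiVersion: v1
kind: Pod
metadata:
  name: my-private-pod
spec:
  imagePullSecrets:
  - name: dockerhub-creds          # credential used by the kubelet to pull
  containers:
  - name: app
    image: myuser/private-app:1.0  # hypothetical private image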
We can use the following commands to inspect the situation:

- `kubectl describe pod <pod-name>`: check the status, errors, and events
- `kubectl logs <pod-name> -c <container-name>`: check the logs of a specific container (including init containers) inside the Pod
2. **Liveness Probe Failure:**

This is usually due to one of two reasons:

- Insufficient `initialDelaySeconds`: the app takes longer to start than the configured delay, so the liveness probe sends requests before the app is ready to serve them; the probe fails and the Pod is restarted.
- Unrealistic `timeoutSeconds`: the timeout is configured too low, so before a request can be processed the liveness probe considers the Pod unresponsive and restarts it.
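Both causes are addressed by loosening the probe's timings. A sketch of a more forgiving probe (the `/healthz` path, port, and values are illustrative assumptions, not taken from the repo):

```yaml
# Fragment for a container spec; tune the numbers to the app's real startup time.
livenessProbe:
  httpGet:
    path: /healthz          # assumed health endpoint
    port: 8080              # assumed container port
  initialDelaySeconds: 30   # give the app time to start before the first probe
  periodSeconds: 10
  timeoutSeconds: 5         # allow slow responses before counting a failure
  failureThreshold: 3       # restart only after 3 consecutive failures
```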
3. **CI Pipeline Failure (GitHub Actions):**

This error usually happens for either of the two reasons below:

1. The image lives in a private repo and the workflow tries to access it without logging in to Docker Hub.
2. The Docker Hub token we generated does not have the permissions required to pull/push the image to the repo.
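Both cases are typically fixed by adding an explicit login step before the build/push step, using a token with read/write scope stored in repository secrets. A sketch (the secret names `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` are assumed, not confirmed by Challenge 8's workflow):

```yaml
# Workflow step fragment: authenticate to Docker Hub before docker build/push.
- name: Log in to Docker Hub
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKERHUB_USERNAME }}
    password: ${{ secrets.DOCKERHUB_TOKEN }}   # access token, not the account password
```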
