Commit 34971fc

Automate translation workflow (#6)

* Automate translation workflow
* chore: add English translations for PR #6

1 parent 6a06847 commit 34971fc

File tree

3 files changed: +289 −3 lines changed


_posts/2025-07-03-actions-runner-controller.md

Lines changed: 1 addition & 0 deletions

@@ -285,3 +285,4 @@ docker run -it \

Using ARC solves both the high cost of GitHub-hosted runners and the inefficiency of managing your own VMs as runners. It is an especially powerful tool when building MLOps CI/CD environments that need GPUs or have complex dependencies.

The initial setup can feel a bit involved, but once in place it can significantly cut CI/CD costs and reduce operational burden, so if you are working on MLOps, it is well worth considering.
Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@

---
feature-img: assets/img/2025-07-03/0.png
layout: post
subtitle: Building an MLOps CI environment
tags:
- MLOps
- Infra
title: Setting Up Actions Runner Controller
---

### Intro

As I’ve been enjoying AI-driven development lately, the importance of a solid test environment has really hit home.

A common approach is to build CI with GitHub Actions, but in MLOps you often need high-spec instances for CI.

GitHub Actions does offer [GPU instances (Linux, 4 cores)](https://docs.github.com/ko/billing/managing-billing-for-your-products/about-billing-for-github-actions), but at the time of writing they cost $0.07 per minute, which is quite expensive.

They’re also NVIDIA T4 GPUs, which can be limiting performance-wise as models keep growing.
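To put that per-minute price in perspective, here is a quick back-of-the-envelope calculation. The usage figures are made up for illustration; only the $0.07/minute rate comes from the text above:

```python
# GitHub-hosted GPU runner price at the time of writing (USD per minute)
price_per_minute = 0.07

# Hypothetical team: 30 CI runs per day, 20 GPU-minutes per run
runs_per_day = 30
minutes_per_run = 20
days_per_month = 30

monthly_minutes = runs_per_day * minutes_per_run * days_per_month
monthly_cost = monthly_minutes * price_per_minute

print(f"{monthly_minutes} GPU-minutes/month -> ${monthly_cost:,.2f}/month")
# → 18000 GPU-minutes/month -> $1,260.00/month
```

At that scale, a modest self-hosted GPU box pays for itself within a few months, which is the motivation for everything that follows.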

A good alternative in this situation is a self-hosted runner: as the name suggests, you set up the runner yourself and execute GitHub workflows on it.

You can configure one by following GitHub’s [Add self-hosted runners](https://docs.github.com/ko/actions/how-tos/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners) guide.

However, this approach requires the CI machine to always be online, which is inefficient if CI/CD jobs are infrequent.

That’s where the Actions Runner Controller (ARC) shines as an excellent alternative.

[Actions Runner Controller](https://github.com/actions/actions-runner-controller) is an open-source controller that manages GitHub Actions runners in a Kubernetes environment. With it, CI runs on your own Kubernetes resources only when a GitHub Actions workflow is actually executed.

### Install Actions Runner Controller

Installing ARC has two main steps:

1. Create a GitHub Personal Access Token for communication and authentication with GitHub
2. Install ARC via Helm and authenticate with the token you created

#### 1. Create a GitHub Personal Access Token

ARC needs to authenticate to the GitHub API to register and manage runners. Create a GitHub Personal Access Token (PAT) for this.

- Path: Settings > Developer settings > Personal access tokens > Tokens (classic) > Generate new token

When creating the token, choose the [appropriate permissions](https://github.com/actions/actions-runner-controller/blob/master/docs/authenticating-to-the-github-api.md#deploying-using-pat-authentication). (For convenience here, grant full permissions.)

> For security, use least privilege and set an expiration date. Authenticating via a GitHub App also appears to be recommended over using a PAT.

Keep the PAT safe; you’ll need it to install ARC in the next step.

#### 2. Install ARC with Helm

ARC requires cert-manager. If cert-manager isn’t set up in your cluster, install it first:

```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml
```

Now install ARC into your Kubernetes cluster with Helm, using the Personal Access Token you created earlier. Replace YOUR_GITHUB_TOKEN below with your PAT value.

```bash
helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
helm repo update

helm pull actions-runner-controller/actions-runner-controller
tar -zxvf actions-runner-controller-*.tgz

export GITHUB_TOKEN=YOUR_GITHUB_TOKEN

helm upgrade --install actions-runner-controller ./actions-runner-controller \
  --namespace actions-runner-system \
  --create-namespace \
  --set authSecret.create=true \
  --set authSecret.github_token="${GITHUB_TOKEN}"
```

After installation, verify the ARC controller is running:

```bash
kubectl get pods -n actions-runner-system
```

If the command succeeds, you should see the ARC controller manager pod running in the actions-runner-system namespace.

ARC is now ready to talk to GitHub. Next, define the runner that will actually execute your workflows.

### 3. Configure a Runner

The ARC controller is installed, but there’s no runner yet to execute workflows. You need runner pods that are created based on GitHub Actions jobs.

You’ll use two resources:

1. RunnerDeployment: acts as a template for runner pods. It defines the container image, target GitHub repository, labels, and so on.
2. HorizontalRunnerAutoscaler (HRA): watches the RunnerDeployment and automatically adjusts its replicas based on the number of queued jobs in GitHub.

#### Define RunnerDeployment

Create a file named runner-deployment.yml as below, changing spec.template.spec.repository to your own GitHub repo.

> If you have the permissions, you can also target an organization instead of a single repository.

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment
  namespace: actions-runner-system
spec:
  replicas: 1
  template:
    spec:
      repository: <YOUR_NAME>/<YOUR_REPO_NAME>
      labels:
        - self-hosted
        - arc-runner
```

Apply it with `kubectl apply -f runner-deployment.yml`; you can then check the self-hosted runner in your GitHub repo’s Actions settings.

<img src="/assets/img/2025-07-03/1.png">

Once the deployment is up, after a short while you’ll see a new runner with the labels self-hosted and arc-runner under Settings > Actions > Runners in your repository.

#### Define HorizontalRunnerAutoscaler

Next, define an HRA to autoscale the RunnerDeployment you just created. Create hra.yml:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-hra
  namespace: actions-runner-system
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 0
  maxReplicas: 5
```

Apply it with `kubectl apply -f hra.yml`. By setting minReplicas and maxReplicas, the HRA scales the number of runners up and down within those bounds.

You can also configure additional metrics so that pods are created whenever a workflow is triggered; many other metrics are supported.

> When using HorizontalRunnerAutoscaler with runners scaled to zero, runners are created only when needed, so during idle periods you won’t see any runners in the GitHub UI.

<img src="/assets/img/2025-07-03/2.png">

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-hra
  namespace: actions-runner-system
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 0
  maxReplicas: 5
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames: ["<YOUR_NAME>/<YOUR_REPO_NAME>"]
```

The above is my preferred metric: it scales up whenever workflows are queued. As shown, you can pick the metrics that fit your needs.
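Besides queue-based scaling, ARC's documentation also describes a utilization-based metric, PercentageRunnersBusy, which keeps a warm pool and scales on how busy it is. A sketch under that assumption; the thresholds and factors below are illustrative, and note that minReplicas must be at least 1 here, since an empty pool has no utilization to measure:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-hra
  namespace: actions-runner-system
spec:
  scaleTargetRef:
    name: example-runner-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"    # scale up when more than 75% of runners are busy
      scaleDownThreshold: "0.25"  # scale down when fewer than 25% are busy
      scaleUpFactor: "2"          # double the replica count on scale-up
      scaleDownFactor: "0.5"      # halve it on scale-down
```

The trade-off versus the queued-runs metric is latency against cost: a warm pool picks up jobs immediately but idles between them.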

### 4. Use it in a GitHub Actions workflow

All set! Using the new ARC runner is simple: specify the labels you set in the RunnerDeployment under runs-on in your workflow.

Add a simple test workflow (test-arc.yml) under .github/workflows/ in your repo:

```yaml
name: ARC Runner Test

on:
  push:
    branches:
      - main

jobs:
  test-job:
    runs-on: [self-hosted, arc-runner]
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Test
        run: |
          echo "Hello from an ARC runner!"
          echo "This runner is running inside a Kubernetes pod."
          sleep 10
```

The key part is runs-on: [self-hosted, arc-runner]. When this workflow runs, GitHub assigns the job to a runner that has both labels. ARC detects the event and, per your HRA settings, creates a new runner pod if needed to process the job.

> Unlike GitHub-hosted runners, self-hosted runners come with few tools preinstalled, so you may need to install some packages within your workflow.
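For example, a hypothetical extra step for the test workflow above. The package names are illustrative, and this assumes the runner image is Debian/Ubuntu-based with passwordless sudo, as the default summerwind images appear to be:

```yaml
      - name: Install packages the job needs
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends jq unzip
```

If the same packages are needed in every job, baking them into a custom runner image avoids repeating this step.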

### Troubleshooting notes

For CI/CD I often use Docker, and one recurring issue is Docker-in-Docker (DinD).

With ARC, by default the runner container and a Docker daemon container run as sidecars in the same pod.

For cases where this layout causes problems, there’s a runner image that supports DinD. If you specify that image and set dockerdWithinRunnerContainer as below, the Docker daemon runs inside the runner container itself, and the workflow runs on that runner.

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment
  namespace: actions-runner-system
spec:
  replicas: 1
  template:
    spec:
      repository: <YOUR_NAME>/<YOUR_REPO_NAME>
      labels:
        - self-hosted
        - arc-runner
      image: "summerwind/actions-runner-dind:latest"
      dockerdWithinRunnerContainer: true
```

For Docker tests that need GPUs, if your cluster has the NVIDIA Container Toolkit installed, using the DinD image above allows the GPU to be recognized.

Configure your workflow like this to confirm GPUs work even in a DinD setup. (Make sure your NVIDIA Container Toolkit and NVIDIA GPU driver plugin versions are compatible!)

```bash
# Check GPU devices
ls -la /dev/nvidia*

# Locate the NVIDIA tools and libraries
smi_path=$(find / -name "nvidia-smi" 2>/dev/null | head -n 1)
lib_path=$(find / -name "libnvidia-ml.so" 2>/dev/null | head -n 1)
lib_dir=$(dirname "$lib_path")
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$lib_dir
export NVIDIA_VISIBLE_DEVICES=all
export NVIDIA_DRIVER_CAPABILITIES=compute,utility

# Mount GPU devices and libraries directly without the nvidia runtime
docker run -it \
  --device=/dev/nvidia0:/dev/nvidia0 \
  --device=/dev/nvidiactl:/dev/nvidiactl \
  --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
  --device=/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools \
  -v "$lib_dir:$lib_dir:ro" \
  -v "$(dirname "$smi_path"):$(dirname "$smi_path"):ro" \
  -e LD_LIBRARY_PATH="$LD_LIBRARY_PATH" \
  -e NVIDIA_VISIBLE_DEVICES="$NVIDIA_VISIBLE_DEVICES" \
  -e NVIDIA_DRIVER_CAPABILITIES="$NVIDIA_DRIVER_CAPABILITIES" \
  pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
```

### Wrapping up

We covered how to build a dynamically scalable self-hosted runner environment by deploying Actions Runner Controller on Kubernetes.

Using ARC solves both the high cost of GitHub-hosted runners and the inefficiency of managing your own VMs as runners. ARC is especially powerful when you need GPUs or have complex dependencies in an MLOps CI/CD setup.

The initial setup can feel a bit involved, but once in place it can significantly cut CI/CD costs and reduce operational burden. If you’re working on MLOps, it’s well worth considering.

scripts/translate_to_en.py

Lines changed: 15 additions & 3 deletions

```diff
@@ -10,7 +10,7 @@

 try:
     from openai import OpenAI
-except Exception:
+except ImportError:
     print("[ERROR] openai package not available. Make sure it's installed.")
     sys.exit(1)
```

```diff
@@ -105,7 +105,7 @@ def split_front_matter(translated_markdown: str) -> tuple[dict, str]:
         fm = yaml.safe_load(fm_text) or {}
         if not isinstance(fm, dict):
             fm = {}
-    except Exception:
+    except yaml.YAMLError:
         fm = {}
     return fm, body
```

```diff
@@ -154,10 +154,22 @@ def main() -> int:

         # Construct English filename by preserving the original filename
         en_path = to_en_filename(src)
+
+        # If the English file already exists, remove it first to ensure a clean overwrite
+        if en_path.exists():
+            try:
+                en_path.unlink()
+                print(
+                    f"[translate] Overwriting existing: {en_path.relative_to(REPO_ROOT)}"
+                )
+            except OSError:
+                # Best-effort unlink; continue with write which will overwrite contents
+                pass
+
         en_post = frontmatter.Post(body, **fm)
         with open(en_path, "w", encoding="utf-8") as f:
             f.write(frontmatter.dumps(en_post))
-        print(f"[translate] Created: {en_path.relative_to(REPO_ROOT)}")
+        print(f"[translate] Created/Updated: {en_path.relative_to(REPO_ROOT)}")
         created += 1

     print(f"[translate] New English posts: {created}")
```
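A side note on the overwrite logic in the last hunk: on Python 3.8+, `Path.unlink(missing_ok=True)` covers the missing-file case without an `exists()` check (a sketch, not part of this commit; the broader OSError handling, e.g. for permission errors, would still need its own try/except):

```python
from pathlib import Path

# Hypothetical stand-in for en_path from the script above
en_path = Path("example-en.md")
en_path.write_text("old contents", encoding="utf-8")

# Python 3.8+: missing_ok=True swallows FileNotFoundError,
# so the call is safe whether or not the file exists
en_path.unlink(missing_ok=True)
en_path.unlink(missing_ok=True)  # already gone; still no error

en_path.write_text("new contents", encoding="utf-8")
print(en_path.read_text(encoding="utf-8"))  # → new contents
```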
