GitHub Actions Self-Hosted Runner (Autoscaling with Kubernetes) - a Kubernetes tutorial by Anton Putra on Qualified.One
There are two types of runners available to you with GitHub Actions.
GitHub-hosted runners are virtual machines hosted by GitHub with the GitHub Actions runner application installed.
They come with a free tier; if you reach a limit, you will pay a per-minute rate for each runner.
GitHub also provides options to host your own runners wherever you want: on-premises, in the cloud, or even in Kubernetes containers.
They are free to use with GitHub Actions, but you are responsible for the cost of maintaining your infrastructure.
Many people already have Kubernetes clusters, so it's very convenient to run these self-hosted runners in Kubernetes.
They don't require exposing anything to the internet if you don't want to.
GitHub self-hosted runners use long HTTP polling.
For autoscaling, you can optionally choose to use webhooks; in that case, you need to expose the actions Kubernetes controller to the internet.
Since many companies are not going to allow that, for this video I use long polling, which can be configured to autoscale self-hosted runners in Kubernetes.
For autoscaling, you don't need to install a metrics server or Prometheus.
Metrics are pulled directly from the GitHub API.
First of all, let's create a repository where we will be testing our GitHub Actions workflows.
Let's call it lesson-089 and make it private.
It's a general recommendation from GitHub; if you want to use self-hosted runners, your repository should be private.
This is because forks of your repository can potentially run dangerous code on your self-hosted runner machine by creating a pull request that executes the code in a workflow.
For now, let's leave it alone; we will come back to it later after we create a Kubernetes cluster and deploy a couple of components.
You can use your existing Kubernetes cluster to deploy self-hosted runners.
For this video, I prepared Terraform code to create a VPC, a few subnets, and an EKS cluster, which I named demo.
All the code and links are available in the description.
Let me quickly go over those terraform files.
First is the provider; I already created an admin user in AWS and configured default credentials to use with this provider.
Then we have VPC.
An Internet gateway to provide internet access for our services.
Then I have four subnets in two different availability zones to meet EKS networking requirements.
Two private to host Kubernetes nodes and two public to provision load balancers to expose services outside of Kubernetes and VPC.
NAT gateway to provide internet access for services in private subnets.
A couple of routes, public with a default route to internet gateway and private with a default route to NAT gateway.
EKS cluster with IAM roles.
And a managed instance group.
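As a rough sketch (the region, CIDR, and resource references here are assumptions; the actual code is linked in the description), the layout looks something like this:

```hcl
# provider.tf - uses the default AWS credentials of the admin user
provider "aws" {
  region = "us-east-1" # assumption; use your region
}

# VPC hosting the four subnets across two availability zones
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# EKS cluster named "demo"; IAM roles and subnets are defined elsewhere
resource "aws_eks_cluster" "demo" {
  name     = "demo"
  role_arn = aws_iam_role.eks.arn
  vpc_config {
    subnet_ids = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
  }
}
```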
Let's change the directory to terraform and run init to download providers and initialize the local backend.
Let's run terraform apply to create all of those resources in AWS.
Confirm that you want to create 22 objects and wait a few minutes while Terraform provisions your infrastructure.
When it's done, we still need to configure the kubectl context to communicate with our new cluster.
If you get a NoneType object error, just remove the kube config and try again.
And a standard test to check the connection: kubectl get svc should return the Kubernetes API service.
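Assuming the default AWS CLI setup, configuring the context might look like this (the region is an assumption):

```shell
# Point kubectl at the new cluster; the cluster name "demo" comes from the Terraform code
aws eks update-kubeconfig --name demo --region us-east-1

# Sanity check: should return the default "kubernetes" ClusterIP service
kubectl get svc
```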
The first component that we need to install is the cert-manager.
Actions-runner-controller uses cert-manager for certificate management of its admission webhook.
There are a couple of ways to install cert-manager.
One of them is to use the Helm chart.
Let's add a cert-manager helm repository.
Then update the helm chart index.
And search for a cert-manager chart to check the available versions to install.
The current one is v1.6.0.
As always, I suggest using the same version and upgrading after you successfully install all the components.
Let's use the helm install command and provide a few options, including the version, disabling Prometheus, and enabling CRDs.
Let's get pods in the cert-manager namespace to verify the installation.
Alright, we have all three up.
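Put together, the cert-manager installation from the steps above might look like this (the namespace and release name are assumptions):

```shell
# Add the Jetstack repo and refresh the chart index
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Check the available chart versions
helm search repo cert-manager

# Install cert-manager v1.6.0 with CRDs enabled and Prometheus disabled
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --version v1.6.0 \
  --set installCRDs=true \
  --set prometheus.enabled=false

# Verify: all three cert-manager pods should be Running
kubectl get pods -n cert-manager
```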
Now we need to set up authentication with GitHub API.
There are two ways for actions-runner-controller to authenticate with the GitHub API: using a GitHub App or using a personal access token (PAT).
Functionality-wise, there isn't much of a difference between the two authentication methods.
The primary benefit of authenticating via a GitHub App is an increased API quota.
Let's go with GitHub App authentication.
First, we need to create that GitHub App.
You can either create an app on your personal account, or, if you want to use self-hosted runners across all your organization's repos, create a GitHub App at the organization level.
For this demo, let's do it only for my account.
Go to Settings and then Developer settings.
Then click new GitHub app.
Give it a name; keep in mind that it has to be globally unique.
You can paste your organization's hostname here; I'll use my personal website.
Let's disable webhooks.
If you want to use webhooks, you must expose your actions controller to the internet, and not many companies will allow you to do so.
By default, for autoscaling, we will be using long HTTP polling.
That way, we don't need to expose anything to the internet.
Webhooks are mainly helpful for scaling up quickly; with polling we can still autoscale, just with a small lag.
On the repository level, you need read access to Actions and read-write to Administration.
Metadata is checked by default.
Next step, we need to generate a private key for our app.
Let's also install it right away.
I will select only the new repository that I created for this tutorial lesson-089.
Now we need to create the actions namespace where we are going to host our runners.
If you switch to the Downloads folder, you will find the GitHub App private key.
We will use it in a minute to create a Kubernetes secret.
Let's provide a path to the private key.
Then we need the installation ID, which we can grab from the URL.
And finally, the app ID, which you will find on the GitHub App home page.
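The secret name and key names below follow the actions-runner-controller convention; the namespace and the placeholder values are assumptions to adapt to your setup:

```shell
# Namespace for the controller and runners
kubectl create namespace actions

# Register the GitHub App credentials as a Kubernetes secret
kubectl create secret generic controller-manager \
  --namespace actions \
  --from-file=github_app_private_key=<path-to-downloaded-key>.pem \
  --from-literal=github_app_id=<app-id> \
  --from-literal=github_app_installation_id=<installation-id>
```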
Alright, a secret is created.
Now we can deploy the actions controller using the helm chart.
Add helm repo first.
Same thing here; you need to update the index.
We are going to be using the 0.14 version of this helm chart.
The only important parameter here is the sync period.
Since we will be using the poll method to autoscale our runners, the longer the sync period, the more lag you will get.
The default value is 10 minutes.
You may run into API rate limit issues depending on the size of your environment and how aggressive your sync period configuration is.
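The controller installation with a shorter sync period might look like this (the release name and namespace are assumptions; syncPeriod is the chart value that controls the polling frequency):

```shell
helm repo add actions-runner-controller \
  https://actions-runner-controller.github.io/actions-runner-controller
helm repo update

# Chart version 0.14, with a 1-minute sync period for faster autoscaling
helm install actions-runner-controller \
  actions-runner-controller/actions-runner-controller \
  --namespace actions --create-namespace \
  --version 0.14.0 \
  --set syncPeriod=1m
```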
Let's check if the controller is up.
Now let's create our first self-hosted runner.
We will be using the custom resource definition that comes with this operator.
Let's create a runner.yaml file.
The kind is Runner.
Give it a name; this name will show up in the GitHub UI.
Also, dont forget to specify the namespace where you want to deploy it.
You can define your self-hosted runner on an organization level, or you can bind it to a particular repository as we do here.
The last argument is an array of empty environment variables; it's here just for your reference in case you need to declare them later.
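A manifest matching that description might look like this (the repository owner is an assumption; replace it with your own account):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
  name: k8s-single-runner   # this name shows up in the GitHub UI
  namespace: actions
spec:
  repository: antonputra/lesson-089   # bind to a single repository
  env: []                             # placeholder for environment variables
```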
Let's use kubectl to apply it.
For the first time, I would recommend checking logs of that runner to make sure that it was successfully registered with GitHub API.
First, get pods and then check the logs.
It's successfully configured and connected.
Now we can go to the GitHub Actions tab; this self-hosted runner should show up there as well.
Here it is, k8s-single-runner.
All of the self-hosted runners will have a self-hosted label that we can use in the workflow to differentiate between different machines or, in our case, different containers.
In the beginning, we created an empty repository to test GitHub workflows; it's time to clone it and create a simple job to run on that self-hosted runner.
Our workflow must be located under .github/workflows.
Give it a name: Example 1.
Then just run this workflow on each push to the main branch.
Here we specify that we want to run this job on a self-hosted runner.
The first step is to check out the code, which is really unnecessary here.
Then just print something to the console.
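The workflow described above might look like this (the file name and checkout action version are assumptions):

```yaml
# .github/workflows/example-1.yaml
name: Example 1
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: self-hosted   # target our self-hosted runner
    steps:
      - uses: actions/checkout@v2   # not strictly necessary for this job
      - run: echo "hello world from the self-hosted runner"
```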
Alright, let's commit everything and push it to GitHub.
It should immediately detect the new workflow and start running.
You can see that our workflow was triggered by the push to the main branch.
This is our build job that will run in the container.
By the way, after the job is finished, the container will be destroyed as well, so for any new runs, you will get a new clean container.
We have hello world from the self-hosted runner.
If you list all the pods right now, you will see that a new container was created for the new job.
The second custom resource definition is RunnerDeployment.
With it, we can manage multiple containers within one group.
Let's call it k8s-runners and put it in the same actions namespace.
Same here, specify either organization or a single repository.
Optionally, you can give your self-hosted runners labels and use them in your workflow.
As with any deployment, if you don't specify the replica count, it will create one pod.
If you don't want to use autoscaling, just set replicas to the number of workers you want.
I will improve on this and add autoscaling in the following example.
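A RunnerDeployment matching those options might look like this (the repository owner is an assumption):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: k8s-runners
  namespace: actions
spec:
  # replicas: 2   # uncomment to pin a fixed number of runners instead of autoscaling
  template:
    spec:
      repository: antonputra/lesson-089
      labels:
        - my-custom-runner   # optional custom label to target in workflows
```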
Let's go to the terminal and apply that runner deployment.
You have a new pod created a few seconds ago.
Now let's disable our first workflow by adding .bak at the end of the filename.
Same here, trigger this workflow on each push to the main branch.
But in this case, I want to show you that we can run Docker in Docker and build images in the Kubernetes cluster.
Let's add a similar job.
Now let's create a very simple Dockerfile and install something in it, for example, nginx.
Again, let's add everything to the git tree, commit, and push to the remote.
Alright, it's building.
We have two jobs and only one suitable runner with the my-custom-runner label.
It's going to run those jobs one by one.
The main point here is that we can build Docker images and use custom labels for runners.
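The second example might look roughly like this, assuming a Dockerfile in the repository root that just installs nginx on top of a base image (the file name, image tag, and label are assumptions):

```yaml
# .github/workflows/example-2.yaml
name: Example 2
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: my-custom-runner   # match the custom runner label
    steps:
      - uses: actions/checkout@v2
      # Docker in Docker is available inside the runner pod
      - run: docker build -t my-image:latest .
```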
Let's disable this workflow as well and create example-3.
It's going to be identical to example two; just copy-paste the same build job multiple times.
In this example, we want to autoscale our runners based on the load.
Let's add five jobs; we just need to update the job key.
Next, let's create a horizontal runner autoscaler.
It will target our runner-deployment.
Also, we can specify after what time to scale down the pods; the default value is ten minutes; here, we set 5 minutes.
Specify min and max replica.
Finally, choose the metric type that you want to use for autoscaling.
I will use the total number of queued and in-progress workflow runs.
This metric comes from the GitHub API, so you don't need to install anything extra, such as a metrics server or Prometheus.
And specify the repository associated with that metric.
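An autoscaler matching those settings might look like this (the resource name and repository owner are assumptions; min and max replicas follow the demo):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: k8s-runners-autoscaler
  namespace: actions
spec:
  scaleTargetRef:
    name: k8s-runners                      # the RunnerDeployment created earlier
  scaleDownDelaySecondsAfterScaleOut: 300  # 5 minutes instead of the default 10
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - antonputra/lesson-089            # repository the metric is scoped to
```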
Let's apply it now.
Now let me run watch kubectl get pods in the actions namespace.
Lets switch to our test repository, commit a new workflow, and push it.
Now we have five pending jobs.
As you remember, we specified a sync period of 1 minute, so there will be a little bit of lag before the horizontal runner autoscaler scales up our runners.
Alright, we now have three new runners created by the autoscaler, which will allow our workflow to complete much quicker.
All of the jobs are running in parallel now.
Everything is completed, and we don't have any new jobs; it will take up to 5 minutes to scale the runners back down in Kubernetes.
All of them are gone except one runner.
Thank you for watching, and I'll see you in the next video.