---
title: "Autoscalable GitLab runners on AWS EC2"
id: "4575"
type: "post"
slug: "autoscalable-gitlab-runners-on-aws-ec2"
published_at: "2026-05-21T09:41:37+00:00"
modified_at: "2026-05-21T10:12:57+00:00"
url: "https://palark.com/blog/autoscalable-gitlab-runners-on-aws-ec2/"
markdown_url: "https://palark.com/blog/autoscalable-gitlab-runners-on-aws-ec2.md"
excerpt: "Learn how you can benefit from the GitLab Fleet scaling feature to automatically scale its runners, resulting in reduced cloud costs for CI/CD. We’ll go step by step from choosing the right executor in GitLab to configuring a custom AMI..."
taxonomy_post_tag:
  - "AWS"
  - "CI/CD"
  - "FinOps"
  - "GitLab"
taxonomy_language:
  - "English"
taxonomy_mailchimp:
  - "created_newsletter"
taxonomy_author:
  - "timur.nizamutdinov"
---

# Autoscalable GitLab runners on AWS EC2

21 May 2026

By Timur Nizamutdinov, software engineer

This article shares our experience of getting rid of continuously running EC2 instances, setting up scalable GitLab Runners in AWS, and significantly cutting CI infrastructure costs. How did it happen?

At first, everything worked like clockwork. Then our GitLab Jobs started experiencing delays of 5, 10, and even 15 minutes. Pipeline queues backed up, our DevOps engineers grew nervous, our developers were frustrated, and AWS quietly kept charging us for hundreds of EC2 instance hours.

The “let’s just spin up some EC2s” trick only got us so far. Soon enough, we again found ourselves staring the very same issues in the face: idle instances, money going down the drain, and no real Job isolation. We realized that having to “fix” the issue over and over was a dead end — we had no choice but to set up real autoscaling for our GitLab Runners.

## Goals: automated scaling, lower costs

We aimed to implement an approach that would scale CI to accommodate fluctuating workloads and reduce our AWS costs.

Rather than keeping runners on “always-on” EC2s, the idea was to automatically spin up instances for specific Jobs or groups of Jobs. When a new Job comes in, an EC2 instance is created. Once the Job is completed, the instance either takes the next one in the queue or shuts down automatically if there is no more work to do.

In this setup, the only “always-on” EC2 we are left with is a managing instance running the GitLab Runner, responsible for launching Jobs and orchestrating them. It consumes very modest resources, so keeping it running costs next to nothing compared to the instances that do the actual work.

To sum up, our objectives were to:

- auto-spin up EC2 instances as soon as GitLab CI Jobs come in;
- stop or terminate an instance after it has been idle for N minutes;
- keep Jobs isolated: one VM = one Job (or a small Job batch);
- automate Amazon Machine Image (AMI) builds to quickly pop up identical instances featuring the same software;
- manage the runners using Terraform as part of the infrastructure as code approach.

## Workflow and GitLab + AWS setup overview

Here is a helicopter view of our expected workflow:

- GitLab triggers a Job using a specific tag →
- The managing Runner picks it up and uses the Fleeting Plugin to spin up a new EC2 instance →
- The build is run on that instance →
- Once the Job is done, the instance is automatically stopped or deleted.

We have implemented it using GitLab Runner version 15.11+ with Fleet scaling support in our Ubuntu Linux-based setup. [Fleet scaling](https://docs.gitlab.com/runner/fleet_scaling/) is GitLab’s built-in feature for scaling runners using external resources. That means Jobs don’t run on the runner server, but on ephemeral virtual machines in the cloud (AWS in our case).

[Fleeting](https://docs.gitlab.com/runner/fleet_scaling/fleeting/) is the library/plugin that implements this approach by connecting the runner to the cloud. It is responsible for provisioning AWS EC2 instances upon the runner’s request, connecting to them (typically via SSM), offloading the Jobs to them, and terminating or deleting the instances once they are no longer required.

Here are the steps we’ll be following to reach our goals:

- Choose an executor. Figure out how the GitLab Runner will run the Jobs — whether right on the VM, inside a Docker container, or on ephemeral EC2s.
- Install GitLab Runner on managing EC2. This “always-on” runner will accept Jobs from GitLab and use Fleeting to provision temporary instances for them.
- Create an IAM user. Set up a dedicated user (e.g., `gitlab-autoscaler`); the runner will use its credentials to interact with EC2 and Auto Scaling.
- Configure the IAM policy. Grant the IAM user the exact permissions it needs: creating/deleting instances, interacting with the Auto Scaling Group, and reading resource details.
- Install the Fleeting Plugin. This plugin connects the runner to AWS and allows it to automatically start and stop EC2 instances.
- Configure GitLab Runner. Modify `config.toml` to set up executors, autoscaler parameters, S3 caching, idle policies (`idle_time`), and the maximum number of instances.
- Prepare the AMI for worker instances. Build a base image containing all the software we need, such as *gitlab-runner*, *docker*, *kubectl*, *helm*, etc. We will reference that AMI in the Launch Template.
- Prepare the Auto Scaling Group. Create and configure the ASG and the Launch Template: specifying the instance type, AMI, scaling parameters, and updating the image if necessary.

## Implementation

### 1. Picking the GitLab Runner executor

Before diving into the configs and setup, let’s go over a bit of theory: what GitLab Runner executors are out there and how they differ from one another. Here’s a brief technical comparison of executors:

| Executor | Isolation | Environment | Best suited for |
| --- | --- | --- | --- |
| shell | ❌ | Local VM | Basic scripts, quick tests |
| docker | ✅ | Docker on host | Frontend, unit tests, microservices |
| instance | ✅✅ | Dedicated EC2 instance | Terraform, shell jobs, Ansible, tools |
| docker-autoscaler | ✅✅ | Docker on EC2 | Containerized jobs, builds, frontend CI/CD |
| kubernetes | ✅✅ | Kubernetes Pod | Massive, scalable CI/CD infrastructures |

We will focus on the *[instance](https://docs.gitlab.com/runner/executors/instance/)* and *[docker-autoscaler](https://docs.gitlab.com/runner/executors/docker_autoscaler/)* executors, since they:

- can automatically spin up and shut down EC2 instances for specific Jobs;
- offer great isolation: one VM or one container on a dedicated instance per Job;
- use GitLab’s Fleet scaling, the native autoscaling mechanism for Runners.

What about the *kubernetes* executor? While it also provides isolation and scaling, it requires you to already have a Kubernetes cluster and maintain it. That’s a huge topic that deserves its own write-up, so we’ll skip it in this guide for simplicity’s sake.

### 2. Installing the Runner on the managing instance

Now that we’ve nailed down the components, let’s move ahead to the practical part.

First, we will install the GitLab Runner on the managing EC2 instance. This “always-on” instance is responsible for:

- fetching Jobs from GitLab;
- interacting with the Fleeting plugin;
- provisioning temporary EC2 instances for Jobs.

Note that all further setup steps — installing plugins, tweaking `config.toml`, and running tests — will be **performed on this managing instance** specifically.

```
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" -o gitlab-repo-add.sh
less gitlab-repo-add.sh # check the script contents to be sure nothing bad will happen
sudo bash gitlab-repo-add.sh
sudo apt-get install -y gitlab-runner
```

### 3. Creating an IAM user

Before you install the Fleeting Plugin, you should be ready to provide access to AWS. This GitLab plugin needs to:

- connect to instances (usually via SSM/SSH);
- create and delete EC2 instances;
- access the Auto Scaling Group and Launch Template.

All of these are configured via IAM in AWS. The easiest way is to create a separate IAM user, e.g., `gitlab-autoscaler`. The GitLab Runner will use it to interact with AWS. The Runner picks up this user’s credentials through the AWS profile referenced in `config.toml`:

```
profile = "default"
```

On the machine running the GitLab Runner, you need to add the user’s keys to the `~/.aws/credentials` file:

```
[default]
aws_access_key_id = ...
aws_secret_access_key = ...
```
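
If you prefer the command line to the AWS console, the user and its key pair can be created roughly as follows. This is a minimal sketch: it assumes the AWS CLI is installed and configured with credentials that are allowed to manage IAM, and the function name `create_runner_user` is ours.

```shell
# Sketch: create the dedicated IAM user and one access key pair.
# Assumes an AWS CLI session that is allowed to manage IAM.
create_runner_user() {
  user_name="${1:-gitlab-autoscaler}"

  # The JSON description of the new user is not needed here.
  aws iam create-user --user-name "$user_name" >/dev/null || return 1

  # Prints the access key ID and the secret key on one line.
  # AWS shows the secret only at creation time, so store it right away.
  aws iam create-access-key --user-name "$user_name" \
    --query 'AccessKey.[AccessKeyId,SecretAccessKey]' --output text
}
```

Calling `create_runner_user` without arguments uses the `gitlab-autoscaler` name from this article; the two values it prints go into the credentials file shown above.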

### 4. Granting permissions to the IAM user

For the `gitlab-autoscaler` user to be able to manage resources, we must grant it the appropriate permissions. Let’s create an IAM policy, e.g., `gitlab-runner-autoscaling-policy`, and attach it to the user. This policy allows the user to:

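- create and delete EC2 instances by changing the desired capacity of the Auto Scaling Group and terminating its instances;
- read Auto Scaling Group and instance descriptions;
- push a temporary SSH public key to the group’s instances via EC2 Instance Connect (and read password data);
- use the `gitlab-runner-ao-group` Auto Scaling Group (i.e. `autoScalingGroupName/gitlab-runner-ao-group`).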

This specific policy allows the Fleeting Plugin to start and stop instances in the designated autoscaling group.

Here’s an example policy in JSON (you’ll need to replace the `{$ACCOUNT_ID}` placeholder with your actual AWS Account ID):

```
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Action": [
       "autoscaling:SetDesiredCapacity",
       "autoscaling:TerminateInstanceInAutoScalingGroup"
     ],
     "Resource": "arn:aws:autoscaling:eu-central-1:{$ACCOUNT_ID}:autoScalingGroup:...:autoScalingGroupName/gitlab-runner-ao-group"
   },
   {
     "Effect": "Allow",
     "Action": [
       "autoscaling:DescribeAutoScalingGroups",
       "ec2:DescribeInstances"
     ],
     "Resource": "*"
   },
   {
     "Effect": "Allow",
     "Action": [
       "ec2:GetPasswordData",
       "ec2-instance-connect:SendSSHPublicKey"
     ],
     "Resource": "arn:aws:ec2:eu-central-1:{$ACCOUNT_ID}:instance/*",
     "Condition": {
       "StringEquals": {
         "ec2:ResourceTag/aws:autoscaling:groupName": "gitlab-runner-ao-group"
       }
     }
   }
 ]
}  
```
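
One way to fill in the placeholder and attach the policy from the command line is sketched below. It assumes the JSON above is saved to a file with the `{$ACCOUNT_ID}` placeholder intact; the function names and file names are ours. The elided `...` part of the Auto Scaling Group ARN is not touched and still has to be filled in by hand.

```shell
# Sketch: substitute the account ID into the policy template.
render_policy() {
  # $1 = policy template file, $2 = 12-digit AWS account ID
  sed 's/{\$ACCOUNT_ID}/'"$2"'/g' "$1"
}

# Sketch: create a managed policy from the rendered file and attach it to the user.
create_and_attach_policy() {
  # $1 = AWS account ID, $2 = rendered policy file
  aws iam create-policy \
    --policy-name gitlab-runner-autoscaling-policy \
    --policy-document "file://$2" >/dev/null || return 1

  aws iam attach-user-policy \
    --user-name gitlab-autoscaler \
    --policy-arn "arn:aws:iam::$1:policy/gitlab-runner-autoscaling-policy"
}
```

A typical run would look up the account ID with `aws sts get-caller-identity --query Account --output text`, write the output of `render_policy policy.json <account id>` to a new file, and pass that file to `create_and_attach_policy`.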

### 5. Installing the AWS Fleeting Plugin

Now that the IAM user is ready and the keys are set, let’s install the Fleeting Plugin:

```
# Install Fleeting Plugin for AWS
echo "Installing Fleeting Plugin..."
sudo gitlab-runner fleeting install aws:latest

# Create AWS credentials directory and files
sudo mkdir -p /home/gitlab-runner/.aws
sudo chown gitlab-runner:gitlab-runner /home/gitlab-runner/.aws

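# Create AWS credentials file with S3 cache credentials.
# The placeholders below are template variables substituted before this script runs,
# not shell variables; the quoted 'EOF' keeps the shell from expanding them.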
sudo tee /home/gitlab-runner/.aws/credentials > /dev/null <<'EOF'
[default]
aws_access_key_id = ${aws_s3_cache_access_key}
aws_secret_access_key = ${aws_s3_cache_secret_key}
EOF

sudo tee /home/gitlab-runner/.aws/config > /dev/null <<'EOF'
[default]
region = eu-central-1
output = json
EOF

sudo chown -R gitlab-runner:gitlab-runner /home/gitlab-runner/.aws
sudo chmod 600 /home/gitlab-runner/.aws/credentials
sudo chmod 600 /home/gitlab-runner/.aws/config
```

Now with IAM and the Fleeting Plugin in place, your GitLab Runner can:

- provision and delete EC2 instances for Jobs using the IAM user credentials;
- connect to these instances via AWS Systems Manager (SSM), with no need for direct SSH access;
- run Jobs on them using either the instance executor or Docker containers;
- shut down or delete instances based on the idle policy (`idle_count`, `idle_time`) or when they hit the `max_use_count` limit.

***Brief note on SSM:*** *Every EC2 instance created by the Fleeting Plugin runs under the `AmazonSSMRoleForInstancesQuickSetup` role. This role lets the Runner securely connect to the instance via SSM, so you don’t have to deal with SSH or manage public keys.*

The resulting workflow is as follows:

- The Runner dynamically creates instances within the specified Auto Scaling Group as Jobs come in →
- These instances are automatically assigned the right permissions and settings →
- Once the Job is done, the instances are gracefully terminated according to the defined policies.
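
Before moving on, it can be worth checking that the stored credentials actually reach the Auto Scaling Group. The sketch below assumes the AWS CLI is installed on the managing instance and that it is run under the OS user that owns the credentials file; the function name is ours.

```shell
# Sketch: confirm the profile can describe the Auto Scaling Group.
# Prints the group name, desired capacity, and maximum size on one line.
check_asg_access() {
  group="${1:-gitlab-runner-ao-group}"
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$group" \
    --profile default --region eu-central-1 \
    --query 'AutoScalingGroups[0].[AutoScalingGroupName,DesiredCapacity,MaxSize]' \
    --output text
}
```

An access-denied error here points to the IAM policy, while an empty result usually means the group name or region does not match.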

### 6. Configuring the GitLab Runner

The next step is configuring the GitLab Runner via the `/etc/gitlab-runner/config.toml` file. This is the central configuration hub in which we define:

- the GitLab URL and the Runner registration tokens;
- the executor type for each Job group (*instance*, *docker-autoscaler*, etc.);
- autoscaling parameters: maximum instance count, the idle policy (`idle_count`, `idle_time`), `max_use_count`, and more;
- cache settings (e.g., an S3 bucket for the build cache);
- various scaling policies for different time periods (working hours, nights, weekends).

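Below is a sample Runner configuration. It is a template rather than a finished file: placeholders such as `${gitlab_url}` or `${jsonencode(extra_hosts)}` are substituted when the file is rendered.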

```
listen_address = ":9252"
concurrent = 200
check_interval = 0
connection_max_age = "30m0s"
shutdown_timeout = 0
log_level = "info"
log_format = "text"

[session_server]
  session_timeout = 1800
# Our instance runner that runs Jobs right on EC2 instances (using shell)
[[runners]]
  name = "${runner_name}"
  id = 400
  output_limit = 50000
  url = "${gitlab_url}"
  token = "${registration_token}"
  token_obtained_at = 2026-02-01T12:42:42Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "instance"

# S3 caching to speed up your builds
  [runners.cache]
    Type = "s3"
    Path = "cache"
    Shared = true
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      AccessKey = "${aws_s3_cache_access_key}"
      SecretKey = "${aws_s3_cache_secret_key}"
      BucketName = "walli-gitlab-runner-cache"
      BucketLocation = "eu-central-1"

  [runners.autoscaler]
    capacity_per_instance = ${capacity_per_instance}
    max_use_count = ${max_use_count}
    max_instances = ${max_instances}
    plugin = "aws:latest"
    instance_acquire_timeout = "0s"
    update_interval = "0s"
    update_interval_when_expecting = "0s"
    [runners.autoscaler.plugin_config]
      name = "gitlab-runner-ao-group"
      profile = "default"
    [runners.autoscaler.connector_config]
      protocol_port = 0
      username = "gitlab-runner"
      keepalive = "0s"
      timeout = "0s"

    [[runners.autoscaler.policy]]
      idle_count = ${idle_count}
      idle_time = "${idle_time}"
      scale_factor = 1.5
      scale_factor_limit = 10

    [[runners.autoscaler.policy]]
      periods = ["* 10-18 * * mon-fri"]
      idle_count = 20
      idle_time = "${idle_time}"
      scale_factor = 1.5
      scale_factor_limit = 10

# The docker-autoscaler executor, designed for container-based Jobs
[[runners]]
  name = "${runner_name}"
  id = 401
  url = "${gitlab_url}"
  token = "${docker_registration_token}"
  token_obtained_at = 2026-02-01T12:43:43Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "docker-autoscaler"
  environment = ["DOCKER_AUTH_CONFIG={\"auths\":{\"${docker_registry_url}\":{\"auth\":\"${docker_registry_auth}\"}}}"]
# S3 caching to speed up your builds
  [runners.cache]
    Type = "s3"
    Path = "cache"
    Shared = true
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      AccessKey = "${aws_s3_cache_access_key}"
      SecretKey = "${aws_s3_cache_secret_key}"
      BucketName = "walli-gitlab-runner-cache"
      BucketLocation = "eu-central-1"

  [runners.docker]
    tls_verify = false
    image = "ubuntu:24.04"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
    extra_hosts = ${jsonencode(extra_hosts)}
    shm_size = 0
    network_mtu = 0

  [runners.autoscaler]
    capacity_per_instance = 5
    max_use_count = 30
    max_instances = 25
    plugin = "aws:latest"
    update_interval = "0s"
    update_interval_when_expecting = "0s"
    [runners.autoscaler.plugin_config]
      name = "gitlab-runner-ao-group-docker"
      profile = "default"
    [runners.autoscaler.connector_config]
      username = "gitlab-runner"
      keepalive = "0s"
      timeout = "0s"

    [[runners.autoscaler.policy]]
      periods = ["* * * * *"]
      idle_count = 1
      idle_time = "20m0s"
      scale_factor = 1.5
      scale_factor_limit = 10

    [[runners.autoscaler.policy]]
      periods = ["* 10-18 * * mon-fri"]
      idle_count = 20
      idle_time = "20m0s"
      scale_factor = 1.5
      scale_factor_limit = 10
```

The `[runners.autoscaler]` sub-section and the `[[runners.autoscaler.policy]]` items are crucial to control autoscaling. They define how many instances to create at once, how many Jobs each instance can handle, how long to keep them idle, and how to scale during different times of the day.
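
To get a feel for how these numbers interact, here is a rough model in shell. It is not GitLab’s code. The clamping rule follows our reading of the GitLab Runner documentation, which describes `idle_count` as the minimum and `scale_factor_limit` as the maximum idle capacity once `scale_factor` is set. Rounding up and both function names are our assumptions.

```shell
# Upper bound on Jobs one runner entry can run at once:
# max_instances * capacity_per_instance.
max_job_slots() {
  # $1 = max_instances, $2 = capacity_per_instance
  echo $(( $1 * $2 ))
}

# Rough model of the idle-capacity target (an assumption, see the text above).
target_idle_capacity() {
  # $1 = capacity currently in use, $2 = idle_count,
  # $3 = scale_factor, $4 = scale_factor_limit
  awk -v used="$1" -v lo="$2" -v factor="$3" -v hi="$4" 'BEGIN {
    target = used * factor
    rounded = int(target)
    if (rounded < target) rounded++            # round up (assumption)
    if (hi > 0 && rounded > hi) rounded = hi   # scale_factor_limit as the cap
    if (rounded < lo) rounded = lo             # idle_count as the floor
    print rounded
  }'
}
```

With the docker-autoscaler values above, `max_job_slots 25 5` prints `125`, which stays below the global `concurrent = 200`. For the all-day policy, `target_idle_capacity 4 1 1.5 10` prints `6`. If this reading is right, a policy with `idle_count = 20` and `scale_factor_limit = 10` always resolves to 20 idle slots, because the floor exceeds the cap.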

### 7. Preparing an AMI for GitLab Runner

For the Auto Scaling Group to spin up the proper EC2 instances for your CI Jobs, you’ll need a valid AMI. This is a base image that comes pre-loaded with everything your environment might need. The minimum set of requirements for the AMI includes:

- System utilities required for running the Jobs (*docker*, *kubectl*, *helm*, *terraform*, etc.).
- Any extra agents or services; however, you don’t need to register a standalone *gitlab-runner* in the AMI, since the managing instance handles the Runner role.
- If necessary, a pre-configured `kubeconfig` located at `/home/gitlab-runner/.kube/config`.
- Pre-pulled Docker images to speed up the first build.

The easiest way is to build the first AMI manually and then automate everything with Packer later. Here’s a typical manual workflow:

1. Start with a base image like Ubuntu 22.04 from the AWS Marketplace.
2. Spin up a temporary EC2 instance.
3. SSH into the instance and install your software (*gitlab-runner*, *docker*, *kubectl*, etc.).
4. Create a `gitlab-runner` user and its home directory.
5. Add a `kubeconfig` if you need it.
6. Shut down the instance and create an image via *Actions* → *Create image*.
7. Use this AMI within your Auto Scaling Group’s Launch Template.
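
Step 6 can also be done with the AWS CLI instead of the console. A minimal sketch, assuming the CLI is configured for the right region; the function name and both arguments are placeholders of ours:

```shell
# Sketch: stop the temporary instance and create an AMI from it.
bake_ami() {
  # $1 = ID of the temporary instance, $2 = name for the new image
  instance_id="$1"
  image_name="$2"

  aws ec2 stop-instances --instance-ids "$instance_id" >/dev/null || return 1
  # Block until the instance has fully stopped.
  aws ec2 wait instance-stopped --instance-ids "$instance_id" || return 1

  # Prints the ID of the new AMI.
  aws ec2 create-image --instance-id "$instance_id" --name "$image_name" \
    --query ImageId --output text
}
```

The image is not usable immediately; `aws ec2 wait image-available --image-ids <ami id>` blocks until it is ready to be referenced in the Launch Template.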

Once you’ve polished the manual process, move those steps into Packer so you can:

- stop messing with EC2s by hand every time you update something;
- make sure your images are consistent and reproducible;
- have a version-controlled AMI template.

Finally, you just need to feed that AMI into the Launch Template via Terraform.

### 8. Setting up the Auto Scaling Group

Our configuration uses two AWS Auto Scaling Groups: *gitlab-runner-ao-group* and *gitlab-runner-ao-group-docker*. They feature the following parameters:

- Launch Template: `gitlab-runner-autoscaler`;
- AMI: `ami-01f040934be890e5a` (the current image at the time of writing; it may be updated in the future);
- Instance Type: `c7a.4xlarge`, picked based on our workload and the types of Jobs we run.

*(As a side note, the GitLab Fleeting library works with other clouds as well. E.g., you can use Virtual Machine Scale Sets in Azure and instance groups in Google Cloud to implement the same approach.)*

Here’s how you can update the AMI — for example, to add a new `kubeconfig` or bump some tool versions:

1. Spin up an EC2 instance based on the existing AMI.
2. Make your changes, like updating `/home/gitlab-runner/.kube/config` or installing new software.
3. Stop the instance and create a new image from it.
4. Go to *Launch Templates* → `gitlab-runner-autoscaler`.
5. Create a new template version that features your AMI and set it as the default.

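After this, the Auto Scaling Group will automatically start using that image for all new EC2 instances, provided it references the template’s default version (`$Default`) rather than a pinned version number.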
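
Steps 4 and 5 can be scripted with the AWS CLI as well. The sketch below bases the new version on the latest one and overrides only the image ID; the function name is ours, and the template name defaults to the one used in this article.

```shell
# Sketch: add a Launch Template version with a new AMI and make it the default.
update_launch_template_ami() {
  # $1 = new AMI ID, $2 = Launch Template name (optional)
  ami_id="$1"
  template="${2:-gitlab-runner-autoscaler}"

  # The new version inherits everything from the latest one except the image.
  version=$(aws ec2 create-launch-template-version \
    --launch-template-name "$template" \
    --source-version '$Latest' \
    --launch-template-data "{\"ImageId\":\"$ami_id\"}" \
    --query 'LaunchTemplateVersion.VersionNumber' --output text) || return 1

  aws ec2 modify-launch-template \
    --launch-template-name "$template" \
    --default-version "$version" >/dev/null || return 1

  # Prints the version number that is now the default.
  echo "$version"
}
```

Instances that are already running keep the old image; only instances launched after the change pick up the new one.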

We used to do this manually for a while, but then we automated the whole process with Packer and Terraform. Now, updating the template is as simple as:

1. Rebuilding the AMI with Packer.
2. Running `terraform plan` and `terraform apply` to update the Launch Template and related resources.

In the end, the GitLab Runner infrastructure can be boiled down to the following:

- **two Auto Scaling Groups** for different types of Jobs, each with its own runner token;
- one “always-on” **Runner Instance** that handles scaling and communicates with GitLab;
- **a custom AMI**, built with Packer, that includes all the required software;
- **IAM roles** to grant the required permissions;
- **CloudWatch Alarms** for monitoring cluster state and load.

The only remaining weak point is tweaking the configuration of the managing Runner instance. If we need to modify `config.toml` or other system settings, we have to either:

- delete the current instance so the Auto Scaling Group can spin up a new one with the updated configuration, or
- carefully apply the changes manually on the running instance.

Luckily, we don’t need to do that very often. If you have a more elegant approach in mind to updating your managing Runner configs (like Ansible, SSM, or configuration drift control), I’d love to hear about it in the comments.

## Useful considerations

If your Jobs talk to Kubernetes (e.g., via `kubectl`, `helm install`, or deployment scripts), the `kubeconfig` file must be pre-installed in the AMI. Without a valid `kubeconfig`, Jobs on a new EC2 instance won’t be able to connect to the Kubernetes cluster.

You can store the config file at the regular location, `/home/gitlab-runner/.kube/config`, and make it part of the AMI. That way, any new EC2 instance created from that image will be able to use it.

While building your AMI, it’s also a good idea to pre-pull the base container images that are often used in CI. This helps to:

- shorten the first build time on a new instance;
- reduce dependency on external registries.

As a result, an instance with a pre-configured `kubeconfig` and pre-pulled Docker images will start faster and be ready to handle Kubernetes- and Docker-dependent Jobs right away.
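
Pre-pulling can be a single loop in the AMI build script. A minimal sketch, assuming Docker is already installed on the build instance; the function name is ours and the image list is only an example:

```shell
# Sketch: pull each image given as an argument, warning instead of failing
# so that one unavailable image does not abort the whole AMI build.
prepull_images() {
  for image in "$@"; do
    docker pull "$image" || echo "warning: could not pull $image" >&2
  done
}
```

For instance, `prepull_images ubuntu:24.04 alpine:3.20` would cache the default image from `config.toml` plus one hypothetical extra image.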

## Key takeaways with pros & cons

An autoscaler is a great option if you want to avoid keeping EC2 instances running 24/7 just for GitLab Runners. Once a Job comes in, an instance spins up, and once the Job is done, the resources are freed. Everything is transparent, manageable, and scales well.

Our approach with scalable GitLab Runners and custom AMIs has the following pros:

- Runners only start up when there are Jobs to run, saving a lot of money on AWS resources.
- Strong isolation and security: you can set up a “1 EC2 = 1 Job” model or assign a small pool of Jobs to each instance.
- AMIs are easy to update and rebuild with Packer, so the infrastructure remains reproducible.
- Great for high-load CIs: when the load spikes, more instances are simply created to handle it.

However, it comes with the following limitations:

- Editing `config.toml` on the managing instance by hand is a pain. Any change means you either have to recreate the instance or push the configuration into the instance somehow.
- The AMI requires regular updates for the OS, packages, and tools.
- Connecting via SSM requires the correct IAM role and the SSM agent. If the role or its configuration is wrong, you lose access unless you have SSH as a fallback.

What else can be improved?

- Instead of being baked into the AMI, `kubeconfig` can be pulled from AWS Secrets Manager when the instance launches.
- Multiple autoscalers for different GitLab tags: separating frontend, backend, infrastructure jobs, and resource-hungry pipelines.
- Different AMIs for different stacks: dedicated images for Java, Node.js, Python, infrastructure tools, etc.
- Integration with EFS or S3 for caching, configuration of shared volumes, and enabling/disabling shared runners on a schedule (via cron/policy).
- Full automation via Terraform (ASG, Launch Template, IAM) and Packer (AMI builds) to ensure that every change is described as code and undergoes review.
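
The first idea in this list, pulling `kubeconfig` from AWS Secrets Manager at launch, could look roughly like this. It is a sketch under assumptions: the secret holds the whole file as a plain string, the instance profile allows `secretsmanager:GetSecretValue` on it, and both the function name and the secret name are hypothetical.

```shell
# Sketch: write a kubeconfig stored in AWS Secrets Manager to disk.
fetch_kubeconfig() {
  # $1 = secret name or ARN, $2 = target path (optional)
  secret_id="$1"
  target="${2:-/home/gitlab-runner/.kube/config}"

  mkdir -p "$(dirname "$target")" || return 1

  # The subshell keeps the restrictive umask local to this write.
  (
    umask 077
    aws secretsmanager get-secret-value --secret-id "$secret_id" \
      --query SecretString --output text > "$target"
  )
}
```

Called as `fetch_kubeconfig ci/kubeconfig` from the instance’s startup script, it writes the file to the location used throughout this article. When the script runs as root, the file also needs a `chown gitlab-runner:gitlab-runner` afterwards so that Jobs can read it.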

Perhaps your specific use case will need some other customizations to benefit from this approach — feel free to share your thoughts and experience in the comments below!

