A couple of days ago our monitoring alerted us to high disk usage on our GitLab server. So I started digging around on the server to figure out whether our projects themselves had grown too big or whether something else was wrong.
I started by connecting to the server via SSH and looked at htop to check the overall server health. Everything looked fine, so it was time to find out which files were cluttering up the server.
To check the size of the directories I used the command:
This command returns a list of the directories under the given path along with their sizes. I ran it repeatedly, each time replacing the path with the largest directory from the previous run. After a couple of runs I ended up in /var/opt/gitlab/gitlab-rails/shared/ and noticed that the artifacts directory was using over 100 GB of storage.
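The same drill-down can be sketched in Python. This is only an illustration of the idea, not the command I used: the function names are made up for this example, and it sums apparent file sizes rather than allocated disk blocks, so the numbers can differ slightly from what a disk usage tool reports.

```python
import os


def directory_size(path):
    """Return the total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so that linked files are not counted twice.
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_subdirectories(path):
    """Return (name, size_in_bytes) for each direct subdirectory, largest first."""
    sizes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, directory_size(entry.path)))
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Calling largest_subdirectories("/var/opt/gitlab/gitlab-rails/shared/") and then repeating the call on the biggest entry mirrors the manual process described above.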
Artifacts in GitLab
Artifacts in GitLab are files produced by CI jobs. They can contain anything from logs of failed tests to completed builds, such as a whole React application ready to serve. These artifacts can be downloaded via the GitLab UI and are passed to the next CI job in your pipeline.
We usually set an expiration date for artifacts so they get removed after a couple of days, but somehow not all of them had been removed. After some research I learned that each artifact gets its own expiration date. So if no expiration date was given when the artifact was created, the artifact will never be removed. Even setting a global default expiration date had no effect on already existing artifacts.
When an artifact is given no expiration date, it is never removed.
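To make this concrete, here is a minimal sketch of how such artifacts could be spotted in job data from the GitLab jobs API. It assumes each job is a dictionary with the fields "artifacts" and "artifacts_expire_at" as returned by that API; the function name is made up for this example.

```python
def never_expires(job):
    """Return True if the job has artifacts but no expiration date."""
    has_artifacts = bool(job.get("artifacts"))
    # The API reports a missing expiration date as null (None in Python).
    return has_artifacts and job.get("artifacts_expire_at") is None
```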
A script to clean up artifacts
After searching the GitLab documentation I was not able to find any option to delete old artifacts, so I started writing my own small Python script to do so. The script can be found as a gist on GitHub!
The script requires an access token to call the GitLab API as well as the URL of your GitLab instance. Once those two parameters are given, the script starts by building a list of every repository in your GitLab instance.
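Building that list could look roughly like the sketch below. It is not the code from the gist: the function names are invented for this example, and it assumes the token is sent in the PRIVATE-TOKEN header and that the projects endpoint uses offset pagination with an X-Next-Page response header.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url, token):
    """GET a GitLab API URL and return (parsed JSON body, response headers)."""
    request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(request) as response:
        return json.load(response), response.headers


def list_project_ids(base_url, token, fetch=fetch_json):
    """Collect the IDs of all projects visible to the token, page by page."""
    project_ids = []
    page = "1"
    while page:
        query = urllib.parse.urlencode({"per_page": 100, "page": page})
        projects, headers = fetch(f"{base_url}/api/v4/projects?{query}", token)
        project_ids.extend(project["id"] for project in projects)
        # On the last page GitLab sends an empty X-Next-Page header.
        page = headers.get("X-Next-Page", "")
    return project_ids
```

The fetch parameter exists only so the paging logic can be tried out with a fake function instead of a live server.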
Once the list is complete, the script fetches every job ever run in every repository to gather information about every artifact. It then goes through the artifacts and decides for each one whether it should be deleted, based on its age: if the artifact is older than the value defined in delete_everything_older_than, the script deletes it.
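The age check can be sketched as follows. Only the name delete_everything_older_than comes from the script; its value here, the helper names, and the choice to measure age from the job's "created_at" timestamp are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Name taken from the script; the value is an assumed example.
delete_everything_older_than = timedelta(weeks=4)


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2020-01-31T12:00:00.000Z'."""
    # Older Python versions do not accept a trailing 'Z', so spell out UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def should_delete(job, now):
    """Return True if the job has artifacts and was created before the cutoff."""
    if not job.get("artifacts"):
        return False  # nothing to delete for this job
    age = now - parse_timestamp(job["created_at"])
    return age > delete_everything_older_than
```

Passing the current time in as now, for example datetime.now(timezone.utc), keeps the decision easy to test with fixed dates.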
By default the script runs in dry-run mode, so nothing gets deleted, but it calculates how much data would be deleted.
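A dry-run switch of this kind can be sketched like this. The function and parameter names are hypothetical, and the actual deletion is left to a callable passed in from outside, which could for example send a DELETE request to the job artifacts endpoint of the GitLab API.

```python
def clean_up(jobs, delete_artifacts, dry_run=True):
    """Return the number of bytes that were (or would be) freed.

    jobs: iterable of (project_id, job) pairs already selected for deletion.
    delete_artifacts: callable taking (project_id, job_id); it is only
    called when dry_run is False.
    """
    freed = 0
    for project_id, job in jobs:
        # Each job lists its artifact files together with their sizes in bytes.
        freed += sum(artifact["size"] for artifact in job["artifacts"])
        if not dry_run:
            delete_artifacts(project_id, job["id"])
    return freed
```

Defaulting dry_run to True means that forgetting the argument can never delete anything by accident.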
The script takes quite some time to finish a run. In fact, I started the script, then started writing this blog post, and it is still running right next to my text editor. By now it has deleted over 50 GB of files we do not need anymore. So I call this small side project a success!
I hope you find these explanations useful and any feedback is highly welcome!