Recently, I had a chance to work on CI (Continuous Integration) for one of the clients I’m working with. They were using a single-node Jenkins server for all the automation. The client follows a microservices architecture, with roughly 20 services running in production. All of these need to go through the CI system before being deployed to the Dev, QA, and Production environments. Deployments are done manually.
The following diagram gives an idea of the entire flow.
We start with a GitHub webhook event fired whenever a dev pushes code to GitHub. This triggers a Multibranch Pipeline job. If the branch is not a release branch, we ask the user whether to run the docker steps or not. The input waits for 10 minutes and falls back to “no” if nobody responds. Next, we build the code, test it, and run source code analysis, depending on the repository. Finally, we check the answer to the docker input and, based on that, either end the job or build the docker image and push it to AWS Elastic Container Registry to be deployed later.
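The flow above can be sketched as a declarative Jenkinsfile. This is a minimal illustration, not the client’s actual pipeline: stage names, the `RUN_DOCKER` variable, and the `make` targets are all assumptions.

```groovy
// Hypothetical sketch of the pipeline flow described above.
pipeline {
    agent any
    stages {
        stage('Docker step decision') {
            when { not { branch 'release/*' } }   // only ask on non-release branches
            steps {
                script {
                    // Wait up to 10 minutes for an answer, defaulting to "no"
                    try {
                        timeout(time: 10, unit: 'MINUTES') {
                            env.RUN_DOCKER = input(
                                message: 'Run docker steps?',
                                parameters: [choice(name: 'RUN_DOCKER', choices: ['no', 'yes'])]
                            )
                        }
                    } catch (err) {
                        env.RUN_DOCKER = 'no'   // timed out or aborted
                    }
                }
            }
        }
        stage('Build, test, analyze') {
            steps { sh 'make build test sca' }
        }
        stage('Docker build and push to ECR') {
            when { environment name: 'RUN_DOCKER', value: 'yes' }
            steps { sh 'make docker-build' }
        }
    }
}
```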
The main issue I observed with the current setup was duplicated code across all of those repositories. A simple change, say updating the ECR endpoint or the docker image tag format, meant making the same change in every repository. The other issue was that a lot of the code was Jenkins-specific, i.e., Groovy scripts. This makes it hard to switch to a different CI system if needed.
The root cause of the problem was the polyglot nature of the codebase across repositories; in hindsight, there was no easy solution other than replicating the code when the developers started out with a few repositories. But as the organization evolved, it had become serious tech debt.
In order to solve this problem the right way, we settled on the following requirements for the solution we would build.
The goal was not to compromise on these three things.
There are some known solutions for some of these problems, but none of them met all of our criteria. For example, Jenkins Shared Libraries allow us to move all the reusable code into a separate repository. However, this means all the shared code has to be Jenkins-specific Groovy scripts. Alternatively, we could avoid Jenkins-specific code by writing bash scripts and invoking them from Jenkins, but then the code ends up scattered across bash scripts with no sharing between repositories.
A quick note about Jenkins Shared Libraries if you’re new to it.
The Jenkins Shared Libraries feature lets us put shared Jenkins Groovy code into a separate git (or any other version-controlled) repository, structured in a specific way, point Jenkins at the repository location, and allow jobs to import the code as a shared library.
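As a quick illustration of that “specific structure”: functions under `vars/` become pipeline steps, and a Jenkinsfile imports them with the `@Library` annotation. The library name `ci-shared-lib` and the `dockerInput` step are hypothetical names for this sketch, not the actual client library.

```groovy
// Hypothetical shared library step: vars/dockerInput.groovy in the library repo.
// Exposed in pipelines as dockerInput(); asks the docker question with a timeout
// and falls back to 'no'.
def call(int timeoutMinutes = 10) {
    try {
        timeout(time: timeoutMinutes, unit: 'MINUTES') {
            return input(message: 'Run docker steps?',
                         parameters: [choice(name: 'RUN_DOCKER', choices: ['no', 'yes'])])
        }
    } catch (err) {
        return 'no'
    }
}
```

A Jenkinsfile then pulls the library in with `@Library('ci-shared-lib') _` at the top and calls `dockerInput()` like any built-in step.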
So we came up with a hybrid approach to pick the best parts of each world.
- Jenkins Shared Libraries for sharing common Jenkins code
- Makefiles and Bash for CI-agnostic code in each repo.
- Common utility bash code included at runtime in the CI system
Now we have three places, each holding part of the solution.
- The repository itself, containing the job pipeline and overridden functions
- A separate repository for Jenkins Shared library
- A common bash script holding lots of shared bash code. It can be hosted anywhere; in our case we host it on GitHub and access it using a GitHub token, which is usually available in a CI system.
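Fetching that common script might look like the sketch below. The repository path, branch, filename, and the `GITHUB_TOKEN` variable name are all assumptions for illustration; your CI system’s credential injection will differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: download ci-helper.sh from a private GitHub repo
# via the raw content endpoint, authenticated with a CI-provided token.
CI_HELPER_REPO="my-org/ci-helpers"   # hypothetical repository
CI_HELPER_REF="main"
CI_HELPER_PATH="ci-helper.sh"

# Build the raw.githubusercontent.com URL for the helper script
ci_helper_url() {
  echo "https://raw.githubusercontent.com/${CI_HELPER_REPO}/${CI_HELPER_REF}/${CI_HELPER_PATH}"
}

# GITHUB_TOKEN is typically injected by the CI system as a credential
fetch_ci_helper() {
  curl -fsSL -H "Authorization: token ${GITHUB_TOKEN}" "$(ci_helper_url)" -o ci-helper.sh
}
```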
Let’s look at some sample code.
Code within a repository
So a typical repository’s code layout looks like the one below.
We are only interested in the following files:
- Jenkinsfile — Here we will have a basic skeleton of the stages involved in the CI pipeline. Minimal Groovy code! The shared library provides common functions like reading an input in Jenkins, plus a special function that downloads a bash script called ci-helper.sh from GitHub. That file has all the common code.
- build_scripts.sh — Here we will have the repository-specific implementation of steps like build, test, source code analysis, and docker_build. The important thing to notice is that we source the ci-helper.sh file at the beginning, which lets us override any parts we want while keeping the defaults otherwise. The surrounding if block ensures the script also works for a dev locally on their system.
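A minimal sketch of what such a build_scripts.sh could look like. The function bodies and the dispatch convention are illustrative assumptions, not the client’s actual code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of build_scripts.sh.
# Source the shared helper if present; it provides default implementations
# (in CI it is downloaded first, locally a dev may not have it).
if [ -f ./ci-helper.sh ]; then
  # shellcheck source=/dev/null
  . ./ci-helper.sh
fi

# Repository-specific override: this repo builds with Maven, say,
# replacing whatever default build() ci-helper.sh provides.
build() {
  echo "building with mvn package"
}

# Only dispatch when invoked with a step name, so a developer can also
# `source` this file locally without side effects.
if [ -n "${1:-}" ]; then
  "$1"
fi
```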
- Makefile — Here we have make targets that invoke the specific steps from build_scripts.sh. Nothing fancy, but the most important thing to note is that it gives a consistent developer experience for executing the code. For example, a developer testing the code locally will run make test, and the same command runs in the CI environment. This helps remove drift between local and CI execution as the code evolves.
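The Makefile can then stay almost trivial, with each target delegating to build_scripts.sh. The target names below are assumptions for this sketch:

```makefile
# Hypothetical Makefile sketch: every target delegates to build_scripts.sh,
# so `make test` behaves the same on a laptop and in CI.
.PHONY: build test sca docker-build

build:
	./build_scripts.sh build

test:
	./build_scripts.sh test

sca:
	./build_scripts.sh sca

docker-build:
	./build_scripts.sh docker_build
```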
Jenkins Shared Library code
Now the only remaining part is the Jenkins Shared Library code, but I have not included it here. It’s mostly simple Groovy functions common to every job, plus the special function that downloads the ci-helper.sh file using the GitHub raw content API. I’ll spare you those details to keep this article short.
With this approach, I was able to remove roughly 150 lines of Jenkins-specific code from every repository, so I’d consider this one of the better refactorings I’ve done in recent times.
One more improvement I see is reducing the 10-minute wait. Since the job is triggered automatically by GitHub, it still waits the full 10 minutes unless a user takes action, and simply reducing the timeout is just a workaround; I’d like to fix it in a better way. One idea I have in mind is a Slack bot that can take the answer in the form of a Slack message, since people tend to respond more quickly there than by going to the GitHub/Jenkins UI. But that’s for some other day!
If you have any other comments/ideas, I’d love to hear them.
Originally published at https://sitaram.substack.com.