GitLab CI/CD for Docker Builds and ArgoCD Deployments on Kubernetes
by Dennis Tyresson, on Jan 25, 2024 10:56:07 AM
DevOps Insights with Dennis
Did you know about the Opslogix DevOps Upskill Program? In the program, skilled IT consultants deepen their DevOps knowledge through a combination of theoretical and practical training.
In this blog series, our DevOps consultant Dennis shares some of the insights he has gained through the program. This is the fourth blog post in the series; read the previous posts by clicking the links below:
- Kubernetes cluster orchestration with Ansible and Terraform on Proxmox VE
- GitOps - the way of the Kubernetes professional
- Harnessing MetalLB - A Deep Dive into Kubernetes Load Balancing
In this blog post, you will learn more about Dennis' insights on Efficient DevOps - GitLab CI/CD for Docker Builds and ArgoCD Deployments on Kubernetes.
Introduction
In the realm of modern development, containerization has become a pivotal practice for ensuring consistency and portability across different environments. Docker, a leading containerization platform, simplifies the packaging and deployment of applications. This article starts by walking you through the process of building a Docker container for our demo Python Flask web application, demystifying the Dockerfile, and illustrating how to effortlessly create and run your container.
We then dive into the technical intricacies of leveraging GitLab CI/CD to automate the process of building a Docker image and seamlessly deploy it to a Kubernetes cluster using ArgoCD. The step-by-step guide outlines the creation of a GitLab pipeline, detailing the Docker build process, image storage in a private registry, and the subsequent deployment orchestration with ArgoCD.
Dockerfile Demystified
A Dockerfile is essentially a script that contains instructions for building a Docker image. It defines the environment, dependencies, and configuration necessary for your application to run within a Docker container. Let's create a basic Dockerfile for a Python Flask app:
FROM python:3.11-slim
# Define virtual environment
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
# Install dependencies:
COPY requirements.txt .
RUN pip install -r requirements.txt
# Run the application:
COPY . /app
WORKDIR /app
CMD gunicorn wsgi:app --bind 0.0.0.0:8080 --log-level=debug --workers=2
EXPOSE 8080
Let's break down the key components of this Dockerfile:
- FROM: Specifies the base image, in this case, python:3.11-slim.
- ENV: Sets environment variables; here, it defines the Python virtual environment path (instead of using activate).
- RUN: Executes commands in the container (in this case, creating a virtual environment and installing Python dependencies from requirements.txt).
- COPY: Copies the current directory's contents into the container at the specified directory (/app).
- WORKDIR: Sets the working directory inside the container to /app.
- CMD: Specifies the default command to run when the container starts.
- EXPOSE: Informs Docker that the application will listen on port 8080 at runtime.
Building and Running Your Docker Container
- Build the Docker image: Open a terminal in the directory containing your Dockerfile and run:
# Local build or to push to Docker Hub (docker.io/library)
docker build -t pyttpass .
# Tag using private registry
docker build -t <registry>/<repository>:<tag> .
# Push image
docker push <image name>
This command builds an image named pyttpass from the current directory (.). The -t flag lets us tag the image with a name and tag, and optionally prefix it with a private image registry to push to. Once the image is built, it can be pushed to the registry using docker push.
- Run the Docker container:
docker run -p 8080:8080 pyttpass
This command runs the container, mapping port 8080 from the container to port 8080 on your host system.
- Access your Flask app: Open a web browser and navigate to http://localhost:8080. Voilà! Your Python Flask web application is now running inside a Docker container.
Dockerizing your Python Flask web application with a well-crafted Dockerfile simplifies deployment and ensures consistency across different environments. This containerized approach enhances collaboration and scalability, making your application ready for deployment in various hosting environments, from local development machines to cloud-based platforms. Dive into the world of containerization, and empower your Flask apps for a seamless and portable future.
Setting the Stage: GitLab and DevOps Pipeline
GitLab serves as the cornerstone of our DevOps workflow. Its integrated features streamline version control, continuous integration, and continuous delivery (CI/CD). We begin our journey by defining a comprehensive pipeline in GitLab, orchestrated to automate key stages of development. This simple GitLab CI/CD pipeline consists of three stages: build, test, and deploy. The "build" stage builds the Docker image, the "test" stage utilizes Trivy for vulnerability scanning, and the "deploy" stage uses Git to hand the new version over to ArgoCD, which deploys it to the Kubernetes cluster. Below, I break down the gitlab-ci file, found in the demo app Pyttpass GitHub repository, in order of execution.
default:
  image: ruby:3.1

variables:
  REGISTRY: gitlab.domain.com:5050
  REPOSITORY: dennis/pyttpass
  VERSION: "1.0"
  RELEASE: "$VERSION-$CI_COMMIT_SHORT_SHA"
  IMAGE_NAME: $REGISTRY/$REPOSITORY:$RELEASE
  GIT_INFRA_DIR: /home/gitlab-runner/git/pyttpass-k8s/deployment # GitLab repository for ArgoCD GitOps
  DEPLOYMENT_FILE: pyttpass-pod.yaml

workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /-draft$/ # Do not run if commit message ends in -draft
      when: never
    - when: always

stages:
  - build
  - test
  - deploy
In this first section, we define the default image used by the GitLab runner (a runner is an application that works with GitLab CI/CD to run jobs in a pipeline) using the Docker executor (the environment used by the runner). The default image can be configured per runner, but here it is explicitly defined in the CI file for clarity. This is useful if we have different runners whose default images do not match our requirements. Next we define our variables, which help us reuse text throughout the file and make it easier to change values in several places from a single point. I also find it handy for reusing the CI file in other projects.
Next are the workflow rules, which define when the pipeline is allowed to trigger. In this configuration, the pipeline triggers on everything except commits whose message ends in "-draft". This allows committing without needlessly triggering the pipeline. Rules can also be applied to individual jobs, for example if we do not want to run a specific job when committing to a certain branch.
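As a hypothetical illustration (this job is not part of the Pyttpass pipeline), a job-level rule that skips a job for commits on a particular branch could look roughly like this:
# Illustrative job only - the job name and branch are placeholders
lint-job:
  stage: test
  rules:
    - if: $CI_COMMIT_BRANCH == "main" # predefined GitLab variable holding the branch name
      when: never
    - when: always
  script:
    - echo "Running checks on branch $CI_COMMIT_BRANCH"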
Last in this section are the stages. Stages allow you to organize and structure the execution of jobs in your pipeline and define the phases through which your pipeline progresses. Each stage represents a logical grouping of related jobs. This approach facilitates a clear and sequential flow of tasks, making it easier to understand, manage, and troubleshoot your CI/CD processes.
Automate the building process
build-job:
  image: docker:24.0.7
  tags:
    - docker
  stage: build
  services:
    - docker:24.0.7-dind
  before_script:
    - echo "Logging in to registry..."
    - echo "$CI_REGISTRY_PASSWORD" | docker login $REGISTRY -u $CI_REGISTRY_USER --password-stdin
    - echo "Login complete."
  script:
    - echo "Compiling the code..."
    - docker build -t $IMAGE_NAME -t $REGISTRY/$REPOSITORY:latest .
    - docker push $REGISTRY/$REPOSITORY --all-tags
    - echo "Compile complete."
  after_script:
    - echo "Cleaning up."
    - docker rmi -f $(docker image ls -q $REGISTRY/$REPOSITORY:latest)
The build-job defines the Docker image build and push process. In my GitLab environment, I use a single self-managed runner with both the shell and Docker executors installed. By tagging the job, GitLab knows that we want a runner with the Docker executor. First, we tell the runner to use the docker:24.0.7 image, which has the Docker CLI installed to run Docker commands. The dind service spins up a Docker daemon used to build images. In the before_script, the job logs into the container registry using credentials from CI variables.
The image is built from the Dockerfile and tagged with both the $IMAGE_NAME and :latest tags. The same image can have multiple tags and is uniquely identified by its ID. The image is pushed to the registry under all tags, which replaces the old :latest image without destroying the actual image ID. This way, we can always access the latest image using the latest tag while preserving all previous builds. Local images on the runner are cleaned up to save disk space using docker rmi. The -q flag prints only image IDs, which lets us remove all images with the same ID as latest.
Automate container deployment
deploy-job:
  stage: deploy # It only runs when *both* jobs in the test stage complete successfully.
  environment: production
  tags:
    - shell
  before_script:
    - echo "Deploying application..."
  script:
    - cd $GIT_INFRA_DIR
    - 'sed -i "s|image:.*|image: ${IMAGE_NAME}|" $DEPLOYMENT_FILE'
    - 'sed -i "s|value: 1.0.*|value: ${RELEASE}|" $DEPLOYMENT_FILE'
    - git add .
    - git commit -m "Updated from Pipeline job $CI_JOB_ID"
    - git push
  after_script:
    - echo "Application successfully deployed."
The deploy-job handles deploying the new image. For this job, we use the shell executor and start by navigating to the work folder where a local clone of the Git repository already lives - this way the directory is reused and we only need to push the changes. A more scalable approach would be to pull the repository content into a temporary Docker container using the Docker executor, but my lab environment is small enough that the former works fine.
While in the git directory, the Kubernetes deployment YAML file is updated, substituting in the new $IMAGE_NAME using the Linux sed command. The app version label is also updated to match the $RELEASE tag. These changes are then committed and pushed to a second GitLab repository watched by ArgoCD (in the demo repository on GitHub, everything is lumped together for demonstration purposes). This triggers ArgoCD to tell the Kubernetes cluster to pull the new image and deploy it. The GitOps workflow lets us control deployments through code changes alone - there is no need to directly access or edit the cluster state.
So in summary, the build job handles building and pushing the image, while the deploy job updates the manifests to deploy the new image. The use of GitOps and YAML templates allows a structured, auditable deployment process.
Scanning for Vulnerabilities
Security is paramount. Trivy, a vulnerability scanner for container images, is integrated into our pipeline to ensure that our Docker images meet security best practices. Vulnerability scanning helps identify and remediate potential risks before deployment, and it is the minimum level of security anyone who builds or uses a Docker image should implement.
include:
  - template: Jobs/Dependency-Scanning.gitlab-ci.yml
  - template: Jobs/Container-Scanning.gitlab-ci.yml

container_scanning:
  tags:
    - docker
  variables:
    CS_IMAGE: $IMAGE_NAME
GitLab offers several built-in tools for security scanning. By including these templates in the CI workflow, the scans run during pipeline execution. The templates included here run in the `test` stage, and the variables and tags are overridden to match this specific environment. We will look deeper into this in future blog posts.
Kubernetes Deployment with ArgoCD
With the Docker image validated, the deployment to a Kubernetes cluster is orchestrated using ArgoCD. ArgoCD simplifies and automates the deployment process, ensuring that the application runs seamlessly across the cluster. ArgoCD achieves this by constantly polling a configured git repository for changes. This git repository contains our Kubernetes manifest files. Below, I will explain them one by one.
apiVersion: v1
kind: Pod
metadata:
  name: pyttpass
  namespace: pyttpass
  labels:
    app: pyttpass
spec:
  containers:
    - name: pyttpass
      image: gitlab.lan.mydomain.com:5050/dennis/pyttpass:1.0-ad069b2f
      env:
        - name: VERSION
          value: 1.0-ad069b2f
        - name: SECRET_KEY
          value: ExAmPle5ecrET # For demo purpose. Should be provided as secret!
      ports:
        - containerPort: 8080
Let's start by looking at the pod definition.
In the metadata field, we specify the name of the pod as well as the namespace and labels - both of which are Kubernetes organizational properties. Then we define the actual container that lives inside the pod by giving it a name and a specific image to use. Next, we specify environment variables; the VERSION value is used inside the web app to display the current build version when browsed. After that, we have a SECRET_KEY value, which is a key used by Flask to encrypt session data. Normally this would be protected using Kubernetes' built-in secret management or an external service like HashiCorp Vault, but that is beyond the scope of this exercise (a minimal sketch is shown below). Finally, we specify our container port, on which the container will listen for incoming traffic; it should correspond to the port specified by our web app.
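As a minimal sketch of that approach (the Secret name and key below are illustrative and not part of the demo manifests), the key could be stored in a Kubernetes Secret:
apiVersion: v1
kind: Secret
metadata:
  name: pyttpass-secret # hypothetical name
  namespace: pyttpass
type: Opaque
stringData:
  secret-key: ExAmPle5ecrET # in practice, use a strong randomly generated value
The container spec would then replace the plain-text value with a reference along these lines:
env:
  - name: SECRET_KEY
    valueFrom:
      secretKeyRef:
        name: pyttpass-secret
        key: secret-key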
apiVersion: v1
kind: Service
metadata:
  namespace: pyttpass
  name: pyttpass-service
  labels:
    app: pyttpass
spec:
  type: NodePort
  selector:
    app: pyttpass
  ports:
    - name: http-pyttpass
      protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 30007
Next, we look at the Service manifest.
The Service acts as an internal load balancer and ambassador for the pyttpass Pods, allowing network traffic to reach the application. It provides a stable, dedicated IP address within the cluster and a DNS name for the pyttpass Pods so that other applications can discover and connect to pyttpass. The Service selects Pods labeled with "app: pyttpass" using the selector field. This links it to the pyttpass Pod defined above.
It exposes pyttpass on port 8080 internally. This matches the containerPort that the pyttpass Pods are listening on. It also exposes pyttpass on a NodePort (explicitly set to 30007 here; if omitted, Kubernetes assigns one from the default 30000-32767 range) on each cluster node for external access. The Service proxies incoming connections on the NodePort to pyttpass Pods on port 8080. This allows external clients to reach pyttpass through the NodePort on any of the Kubernetes cluster nodes.
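To tie things together, ArgoCD is pointed at the Git repository holding these manifests through an Application resource. A minimal sketch could look like the following; the application name, repository URL, and path are illustrative placeholders rather than values from the demo setup:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: pyttpass # hypothetical Application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.domain.com/dennis/pyttpass-k8s.git # placeholder repository URL
    targetRevision: main
    path: deployment # folder containing the Pod and Service manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: pyttpass
  syncPolicy:
    automated: {} # sync automatically whenever the repository changes
With automated sync enabled, the commit made by the deploy-job is all ArgoCD needs to roll out the new image.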
Conclusion
Incorporating a DevOps pipeline into your workflow using GitLab, Docker, Kubernetes, and ArgoCD enhances collaboration, accelerates development cycles, and improves the overall quality and security of your applications. By automating key processes, you empower your team to focus on innovation while ensuring a smooth and secure deployment pipeline from development to production.
This blog article dives into the technical intricacies of leveraging GitLab CI/CD to build a Docker image and seamlessly deploy it to a Kubernetes cluster using ArgoCD. The step-by-step guide outlines the creation of a GitLab pipeline, detailing the Docker build process, image storage in a private registry, and the subsequent deployment orchestration with ArgoCD. By following this comprehensive tutorial, developers can enhance their deployment workflow, ensuring consistency, automation, and GitOps principles in the development lifecycle.
Do you want to learn more about the Opslogix DevOps Upskill Program?