The utilization pattern of CI/CD servers differs from that of typical servers. The reason lies in the nature of the workloads such servers execute. Two major types of workload are commonly present:
- building, testing and deploying artifacts;
- running deployed test environments for git branches, commits, configurations, etc.
Usually it is better to split the workloads listed above so that they run on different servers. The main argument for doing so is that a build workload strives to consume as much compute (CPU, I/O) as possible in order to finish the job as soon as possible. When the second type of workload runs on the same server, engineers must enforce limiting policies to protect the deployed environments from starvation caused by the build workload's aggressive resource consumption.
In a modern Docker-based infrastructure, this is easy to achieve by limiting the CPU cores dedicated to the build task. CI/CD environments also allow limiting the parallelism of build tasks by organizing them into a queue, but this limits computing resources inefficiently and can lead to longer builds. So, first of all, try to split these workloads. If you use a system such as Kubernetes, Docker Swarm or DC/OS, or provision build servers dynamically in some other way, you will probably never see build tasks starving deployed services, provided the clusters are designed properly.
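For illustration, here is a minimal Python sketch that caps the CPU and memory a Docker-based build may consume; the `builder:latest` image and the `make build` command are hypothetical placeholders:

```python
import subprocess

def run_build(image: str, cpus: str = "4", memory: str = "8g") -> None:
    """Run a build container with hard CPU and memory caps so it
    cannot starve co-located deployed environments."""
    subprocess.run(
        [
            "docker", "run", "--rm",
            "--cpus", cpus,          # cap the container to N CPU cores
            "--memory", memory,      # cap the container's RAM usage
            image, "make", "build",  # hypothetical build command
        ],
        check=True,
    )

if __name__ == "__main__":
    run_build("builder:latest")  # hypothetical image name
```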
Build Servers
Well, suppose you have decided to dedicate certain servers to build tasks. Let’s discuss what is needed to achieve better performance.
Local Docker Caching Registry. If your build process is based on Docker, you usually spend some time downloading image layers. A local cache helps decrease that time significantly. Remember that until the layers are downloaded, the build system cannot build artifacts and the CI/CD worker is blocked for other tasks.
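As a sketch of one common setup, the Docker daemon can be pointed at a pull-through cache via the `registry-mirrors` key in `daemon.json`; the mirror address below is a hypothetical internal host:

```python
import json
from pathlib import Path

DAEMON_JSON = Path("/etc/docker/daemon.json")
MIRROR_URL = "https://registry-cache.internal:5000"  # hypothetical local mirror

def enable_registry_mirror() -> None:
    """Point the Docker daemon at a local pull-through cache.
    Requires root and a Docker daemon restart afterwards."""
    config = json.loads(DAEMON_JSON.read_text()) if DAEMON_JSON.exists() else {}
    mirrors = set(config.get("registry-mirrors", []))
    mirrors.add(MIRROR_URL)
    config["registry-mirrors"] = sorted(mirrors)
    DAEMON_JSON.write_text(json.dumps(config, indent=2))

if __name__ == "__main__":
    enable_registry_mirror()
```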
Local Docker Registry. If you upload artifacts for future deployments, a local artifact repository helps do it faster. That is especially important when you deploy your apps as dockerized applications and use tags to roll out branches, commits and environments. Remember that until the artifact is uploaded, the CI/CD task is busy and the worker cannot handle the next task.
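A minimal sketch of the push side, assuming a hypothetical local registry at `registry.internal:5000`:

```python
import subprocess

LOCAL_REGISTRY = "registry.internal:5000"  # hypothetical local registry

def push_local(image: str, branch: str) -> str:
    """Tag a freshly built image for the local registry and push it,
    so the upload stays inside the fast local network."""
    target = f"{LOCAL_REGISTRY}/{image}:{branch}"
    subprocess.run(["docker", "tag", image, target], check=True)
    subprocess.run(["docker", "push", target], check=True)
    return target

if __name__ == "__main__":
    print(push_local("myapp", "feature-x"))  # hypothetical image and branch
```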
HTTP Cache Server. Modern applications depend on many third-party libraries. If you have ever watched Maven build a project, you know that every Java-based build downloads tons of JARs. The same is true for Node, Python, Ruby and other popular development environments. Because all those artifacts are downloaded from various servers that may be under attack, heavy load or maintenance, downloads can slow builds significantly or even make them fail. A local HTTP cache resides in a fast local network, is controlled by your organization, and keeps your CI/CD system building even when the original servers are out of service.
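As one concrete example (the mirror URL is a hypothetical internal host), pip can be pointed at a local caching mirror; npm and Maven have equivalent settings:

```python
from pathlib import Path

PIP_CONF = Path("/etc/pip.conf")
MIRROR = "https://pypi-cache.internal/simple"  # hypothetical local PyPI mirror

def use_local_pypi_mirror() -> None:
    """Point pip at a caching mirror so builds survive upstream outages.
    Equivalent settings exist for npm (.npmrc) and Maven (settings.xml)."""
    PIP_CONF.write_text(
        "[global]\n"
        f"index-url = {MIRROR}\n"
    )

if __name__ == "__main__":
    use_local_pypi_mirror()  # requires root to write /etc/pip.conf
```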
Fast and Spacious Local File Cache. Reserve plenty of disk space for caching dependencies locally. You can use an NFS server or the GitLab CI cache service, which help you avoid downloading dependencies from the network. Connect servers to the cache with at least a 1 Gbit/s link, which guarantees fast access to artifacts when needed. Consider using NFS FS-Cache to reduce the number of NFS read operations.
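A sketch of mounting such a cache over NFS with FS-Cache enabled; the export path is hypothetical, and the `cachefilesd` daemon must be running for the `fsc` option to take effect:

```python
import subprocess
from pathlib import Path

def mount_nfs_cache(server_export: str = "nfs.internal:/ci-cache",  # hypothetical export
                    mountpoint: str = "/mnt/ci-cache") -> None:
    """Mount the shared dependency cache over NFS with FS-Cache enabled.
    The `fsc` option makes the kernel keep read data in a local disk
    cache, cutting repeated NFS reads."""
    Path(mountpoint).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["mount", "-t", "nfs", "-o", "fsc,ro", server_export, mountpoint],
        check=True,
    )

if __name__ == "__main__":
    mount_nfs_cache()  # requires root
```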
Fast Local Filesystem and Plenty of CPU Cores. The build process is mostly bound by CPU and filesystem operations, which means you should consider servers with many high-frequency cores and fast SSD storage if you want builds to finish fast. If you cannot deploy fast SSD storage, consider adding more RAM: the kernel uses it as a file cache, which decreases the number of filesystem read operations. Do not use sophisticated CoW filesystems like ZFS or BTRFS; prefer EXT4 or XFS.
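To check whether the extra RAM is actually working as a file cache, a quick look at `/proc/meminfo` is enough; a small sketch:

```python
def page_cache_stats() -> dict:
    """Read /proc/meminfo and report how much RAM the kernel is
    currently using as a file cache, a quick way to check whether
    extra RAM is absorbing filesystem reads."""
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            fields[key] = int(value.split()[0])  # values are in kB
    total, cached = fields["MemTotal"], fields["Cached"]
    return {"total_mb": total // 1024,
            "cached_mb": cached // 1024,
            "cached_pct": round(100 * cached / total, 1)}

if __name__ == "__main__":
    print(page_cache_stats())
```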
These steps improve build velocity significantly by decreasing the time a CI/CD pipeline spends waiting for the network or the filesystem.
Deployment Servers
Testing and staging servers usually have different requirements than build servers. They generally have many environments deployed, especially when a modern CI/CD process is in use, where branches, commits and configurations are deployed and made accessible through a specially designed routing mechanism based on IPs, domain names or L7 routing (NGINX, Traefik, Kong or HAProxy).
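As an illustration of the idea (all host names and ports are hypothetical), per-branch routing can be as simple as rendering one NGINX server block per deployed environment:

```python
# A minimal sketch of per-branch L7 routing: render one NGINX server
# block per deployed branch, mapping <branch>.ci.example.com to the
# port its environment listens on.
NGINX_TEMPLATE = """\
server {{
    listen 80;
    server_name {branch}.ci.example.com;
    location / {{
        proxy_pass http://127.0.0.1:{port};
    }}
}}
"""

def render_routes(deployments: dict) -> str:
    """Render a config fragment from a branch -> port mapping."""
    return "\n".join(
        NGINX_TEMPLATE.format(branch=branch, port=port)
        for branch, port in sorted(deployments.items())
    )

if __name__ == "__main__":
    print(render_routes({"feature-x": 8001, "pre-release-y": 8002}))
```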
The most characteristic feature of testing and staging servers is the presence of many processes that sit in RAM but are idle most of the time. E.g. branch-X is deployed but waits for the QA division to run regression tests (if manual QA is in use), or pre-release-Y is deployed but waits for stakeholders to see a demonstration. Automated test suites may run against some of these deployed branches.
To address these needs, testing and staging servers require moderate CPU resources but a large amount of RAM. This can make them expensive: cloud providers usually offer balanced instance types, so you pay for CPU capacity you do not need. If you choose on-premise servers, you have to invest in non-standard configurations where the amount of RAM is high while CPU capacity is moderate.
It is not widely known, but these limitations can be overcome with special RAM compression mechanisms, NVRAM and modern NVMe devices. Let’s examine these means in detail. All solutions described below are designed for the GNU/Linux environment; other operating systems probably have similar mechanisms.
Infrastructure Components Optimization. First, let’s address the problem of deploying infrastructure components like DBMSs, queue brokers, search engines, balancers and other system components. Generally, it does not make sense to deploy dedicated instances for every deployed branch, commit or environment. Consider developing an automated deployment scenario for them, but do not embed it into the CI/CD pipeline. To distinguish the objects that belong to a certain deployment, use prefixes or namespaces. E.g. for a database name one can use the prefix <branch>_dbname, so every branch gets a separate database. This is the right approach because it accounts for two ideas: you do not need to test reliable third-party infrastructure components, and a single instance of an infrastructure component saves RAM, disk space and CPU resources.
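A minimal sketch of the naming scheme, assuming a shared PostgreSQL instance and the standard `createdb` client:

```python
import re
import subprocess

def branch_db_name(branch: str, base: str = "appdb") -> str:
    """Derive a per-branch database name: '<branch>_<base>' with the
    branch slug sanitized to characters PostgreSQL identifiers allow."""
    slug = re.sub(r"[^a-z0-9_]", "_", branch.lower())
    return f"{slug}_{base}"

def create_branch_db(branch: str) -> str:
    """Create the branch database on the shared instance (assumes the
    `createdb` client is installed and access rights are in place)."""
    name = branch_db_name(branch)
    subprocess.run(["createdb", name], check=True)
    return name

if __name__ == "__main__":
    print(branch_db_name("feature/login-form"))  # feature_login_form_appdb
```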
The suggested approach usually does not constrain developers from the perspective of CI/CD deployment. To implement it properly, create a separate script that deploys the infrastructure components and also allows wiping everything and deploying again, to verify that all aspects are under control. To improve the process and share the knowledge, assign a new release manager for each iteration, while the previous release manager helps him or her deploy the infrastructure and applications.
Applications-related Optimizations. Linux servers provide several useful mechanisms to compress and deduplicate RAM, which can be used to pack applications more densely into server memory. The deduplication happens in four places:
- storage layer memory deduplication;
- shared objects deduplication;
- general system memory deduplication;
- general system memory compression.
Storage Layer Memory Deduplication. This deduplication works when certain file objects are accessed read-only by many processes. A well-known optimization of this kind is used by Docker when the storage backend is configured to use overlay2, ZFS or AUFS, but it does not work with the BTRFS and Device Mapper storage drivers.
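A quick way to check which storage driver a daemon runs, and therefore whether this optimization can apply, is `docker info`; a small sketch:

```python
import subprocess

def docker_storage_driver() -> str:
    """Return the storage driver the Docker daemon is using; page-cache
    sharing between containers depends on this choice."""
    out = subprocess.run(
        ["docker", "info", "--format", "{{.Driver}}"],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()

if __name__ == "__main__":
    driver = docker_storage_driver()
    print(f"storage driver: {driver}")
    if driver not in ("overlay2", "zfs", "aufs"):
        print("warning: page-cache sharing between containers is unlikely")
```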
Shared Objects Deduplication. The Linux kernel deduplicates read-only code objects in memory if they are compiled as *.so shared libraries. This deduplication is fully automatic and managed by the kernel without any tuning. It also works for Docker-based processes, but it does not work across virtual machines managed by KVM or other hypervisors.
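One way to see this sharing in action is to count how many processes map the same library; a rough sketch over `/proc/<pid>/maps`:

```python
from pathlib import Path

def processes_mapping(lib_substring: str = "libc") -> int:
    """Count processes whose memory map includes a given shared library,
    illustrating that one read-only copy of the code backs them all."""
    count = 0
    for maps in Path("/proc").glob("[0-9]*/maps"):
        try:
            if lib_substring in maps.read_text():
                count += 1
        except (PermissionError, FileNotFoundError):
            continue  # process exited or belongs to another user
    return count

if __name__ == "__main__":
    print(f"{processes_mapping('libc')} processes share the mapped libc code")
```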
General System Memory Deduplication. Linux supports the KSM (kernel same-page merging) mechanism, which finds pages with identical content and merges them, releasing the memory of the duplicate pages. The more entities are derived from the same original image, the more pages can be merged and the less system memory is used. As CI/CD servers usually satisfy this criterion, KSM may be beneficial for memory optimization, especially if you run applications in isolated VM guests.
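A minimal sketch of enabling KSM and estimating the savings through its sysfs interface (requires root; note that KSM only scans regions applications have marked with `madvise(MADV_MERGEABLE)`, which KVM does for guest RAM):

```python
from pathlib import Path

KSM = Path("/sys/kernel/mm/ksm")

def enable_ksm(pages_to_scan: int = 1000) -> None:
    """Enable kernel same-page merging and raise the scan rate."""
    (KSM / "pages_to_scan").write_text(str(pages_to_scan))
    (KSM / "run").write_text("1")

def ksm_saved_mb(page_size: int = 4096) -> float:
    """Per kernel docs, pages_sharing counts duplicate pages that now
    point at a single shared copy, i.e. the pages actually saved."""
    sharing = int((KSM / "pages_sharing").read_text())
    return sharing * page_size / 2**20

if __name__ == "__main__":
    enable_ksm()
    print(f"~{ksm_saved_mb():.1f} MiB saved by KSM")
```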
General System Memory Compression. Recent Linux kernel versions introduced two mechanisms that can be used to implement memory compression: zSwap and zRAM. They look similar but suit slightly different cases. zSwap enables swap compression: it reserves a certain amount of RAM for a dynamic pool that serves as the first and fastest tier of the swap system. zRAM creates a compressed RAM disk that can be used as a device for a filesystem or as a backing device for swap. Both zSwap and zRAM use efficient algorithms such as LZO, LZ4, LZ4HC or deflate to compress and decompress data. LZ4 gives the best compression and decompression speed, while deflate provides the best compression ratio. zSwap and zRAM can easily double the effective RAM of a CI/CD system, or better, especially when only a few branches are active and the rest are idle.
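As a sketch of the zRAM variant (requires root; the size and the LZ4 choice are illustrative), a compressed swap device can be set up through the zram sysfs interface:

```python
import subprocess
from pathlib import Path

def setup_zram_swap(size_gb: int = 8, algo: str = "lz4") -> None:
    """Create a compressed swap device in RAM. The compression
    algorithm must be chosen before setting disksize."""
    subprocess.run(["modprobe", "zram"], check=True)
    zram = Path("/sys/block/zram0")
    (zram / "comp_algorithm").write_text(algo)
    (zram / "disksize").write_text(str(size_gb * 2**30))
    subprocess.run(["mkswap", "/dev/zram0"], check=True)
    # High priority so the kernel prefers compressed RAM over disk swap.
    subprocess.run(["swapon", "-p", "100", "/dev/zram0"], check=True)

if __name__ == "__main__":
    setup_zram_swap()
```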
When zSwap or zRAM is used for swapping RAM pages, the main swap storage device, usually an HDD or SSD, must provide exceptional performance to avoid becoming a bottleneck. Consider modern NVMe devices built on traditional SSD technologies or Intel Optane memory.
Conclusion
A deep and clear understanding of the CI/CD processes that take place in a project enables efficient and lean usage of CI/CD server resources, very dense deployments and significant cost savings. To apply the practices listed above, DevOps engineers must understand the Linux ecosystem and use estimation practices that reveal the pros and cons of every approach they use.
If you like this post and find it useful, please share it with your friends.