One repository, many AWS Lambda functions: Conditional deployment with GitLab CI

Published in Level Up Coding · 4 min read · Jan 19, 2021

At textcloud, we run the heart of our workflow automation platform on AWS Lambda. This allows us to offer our customers a fast and reliable experience. No matter if a user runs a simple workflow once a week or needs to apply sentiment analysis to thousands of Airtable rows, our infrastructure needs to be extremely scalable.

We store the code of all our Lambda functions in a single repository. A monorepo works extremely well for our use case, where all functions share a lot of code, but it has a downside when it comes to our CI/CD pipeline: we only want to build and deploy the functions that have changed in a commit, not all of them. This article shows how we solved this problem.

The Goal: Only deploy projects that changed

This short article explains how to configure GitLab CI for a single repository with several sub-projects. The goal is to only run the build/deploy pipeline for the sub-projects that have changed in a commit and ignore the rest.

Another requirement is that every sub-project runs as a separate job. This gives us better control and allows us to trigger certain deployments manually. It also speeds things up because all deployments can run in parallel.

This article builds on my previous tutorial on deploying serverless functions from GitLab CI, so I won’t go into detail on the actual deployment code.

Step-by-step: Writing the GitLab CI config

What has changed?

First, we need to get a list of all files that have changed in the current commit. We can do this with the following command:

git diff-tree --no-commit-id --compact-summary --name-only -r $CI_COMMIT_SHORT_SHA

The output looks a bit like this:

lang_detection/poetry.lock
schedule/poetry.lock
sentiment/package.json
template/{{ cookiecutter.project_slug }}/serverless.yml
translation/poetry.lock

We don’t actually care what kind of change it is (add/remove/modify); any change should trigger a new deployment.

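The next step is to extract the root folder of each changed file. We can do this by piping the output to grep: grep -oE "^\\w+"

The -o flag prints only the matching part of each line, and ^\w+ matches the leading run of word characters, which is everything before the first slash. Note that a folder name containing a hyphen would be cut off at the hyphen.

This gives us a list of all changed folders, but it can still contain duplicates. We remove them with sort | uniq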

Wrapped up, the following command gives us the names of all the root folders that contain changes:

git diff-tree --no-commit-id --compact-summary --name-only -r $CI_COMMIT_SHORT_SHA | grep -oE "^\\w+" | sort | uniq
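To see what the grep, sort, and uniq stages do without needing a repository, here is a minimal sketch that feeds them the sample file list from above. The extra entry sentiment/handler.js is hypothetical and only there to show the de-duplication.

```shell
# Sample file list taken from the example output above, plus one hypothetical
# extra file (sentiment/handler.js) so that a folder appears twice.
# In CI this text would come from `git diff-tree` instead.
changed_files='lang_detection/poetry.lock
schedule/poetry.lock
sentiment/package.json
sentiment/handler.js
template/{{ cookiecutter.project_slug }}/serverless.yml
translation/poetry.lock'

# Keep only the leading word characters of each path (the root folder),
# then sort the names and drop duplicates.
changed_folders=$(printf '%s\n' "$changed_files" | grep -oE '^\w+' | sort | uniq)

printf '%s\n' "$changed_folders"
```

A file in the repository root such as README.md would show up in this list as README. That is harmless as long as no deployment job is named after it.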

Inheritance in GitLab CI

Now let’s get into the GitLab configuration file. Our strategy is to create one general job for the deployment plus one job per project. Each project job inherits the script from the deployment job and specifies the name of its root directory.

Our deployment function can look like this:

Let me explain the steps:

First, we make a list of all the folders that have changed.
Then we check if the current project's directory is part of this list.
We wrap our deployment script in an if statement so that the deployment only runs when the project has changed, in this case: cd $CURRENT_PROJECT && sls plugin install -n serverless-python-requirements && sls deploy
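The three steps can be sketched in plain shell as follows. This is an illustration under assumptions, not the exact script from our pipeline: the function names project_changed and deploy_if_changed are hypothetical, and the deployment command is passed in as arguments so the logic can be tried without AWS credentials.

```shell
# Succeeds if the project name appears as a whole line in the
# newline-separated folder list.
# $1: folder list, $2: project folder to look for
project_changed() {
  # -F: match literally, -x: match the whole line, -q: no output
  printf '%s\n' "$1" | grep -Fxq -- "$2"
}

# Hypothetical wrapper: runs the given command only if the project changed.
# $1: folder list, $2: project folder, remaining arguments: deploy command
deploy_if_changed() {
  changed_folders=$1
  project=$2
  shift 2
  if project_changed "$changed_folders" "$project"; then
    echo "Deploying $project"
    "$@"
  else
    echo "No changes in $project, skipping deployment"
  fi
}
```

In a CI job, the folder list would come from the git diff-tree pipeline and the command would be the cd/sls line quoted above. Matching whole lines with -x means a change in a folder like sentiment_v2 does not trigger the sentiment deployment.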

For each folder, we can then define a much smaller job:

deploy_sentiment:
  extends: .deploy
  variables:
    CURRENT_PROJECT: sentiment

In this example, the project folder is called ‘sentiment’ and it contains the Lambda function that we use for sentiment analysis. We inherit the script from the .deploy job and set a variable that specifies the folder we want to check.

If we now push new changes to GitLab, it will run the deploy_sentiment job, check the list of changes, and then decide whether to deploy or not.

But this job definition still looks a bit redundant: we already have the name of the folder in the job name, so why do we need to set it again as a variable? Luckily, GitLab provides the $CI_JOB_NAME variable, which is available in the deployment script!

We simply add the step export CURRENT_PROJECT=$(echo $CI_JOB_NAME | sed 's/deploy_//g') to strip the deploy_ prefix from the job name and get the directory name instead.
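One detail worth knowing: a pattern such as s/deploy_//g removes every occurrence of deploy_, not only the prefix. That makes no difference for our folder names, but it would for a folder whose name contains the same string. The sketch below compares it with an anchored pattern and with shell parameter expansion; the function names and the job name deploy_auto_deploy_hooks are hypothetical.

```shell
# Removes every occurrence of "deploy_" (global, unanchored pattern).
strip_global() {
  printf '%s\n' "$1" | sed 's/deploy_//g'
}

# Removes "deploy_" only at the start of the name.
strip_anchored() {
  printf '%s\n' "$1" | sed 's/^deploy_//'
}

# Same result as the anchored sed, using parameter expansion and no subprocess.
strip_expansion() {
  printf '%s\n' "${1#deploy_}"
}

strip_global deploy_sentiment             # prints "sentiment"
strip_global deploy_auto_deploy_hooks     # prints "auto_hooks"
strip_anchored deploy_auto_deploy_hooks   # prints "auto_deploy_hooks"
strip_expansion deploy_auto_deploy_hooks  # prints "auto_deploy_hooks"
```

All three variants give the same result as long as the folder names do not contain deploy_ themselves.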

The final GitLab CI configuration

Happy coding! Or if AWS already reduced you to tears, check out textcloud and see how workflow automation + natural language processing helps your company save time and money by automating complex jobs :)

Feel free to reach out to me if you have trouble getting AWS Lambda to work with GitLab CI!