If you have been using Git for some time, working with pull requests is probably second nature to you by now, and merge conflicts have probably started to irritate you. But what if I told you that even after resolving every conflict, we still can't be sure whether a pull request will break production? Have you ever wondered how enterprise projects are maintained, and how so many contributors work in sync without breaking production?
Engineering teams at Shopify, Uber, and many other companies have faced this same merging problem and reached for the same tool to solve it. Many engineering teams and open source projects are now introducing merge queues into their workflows.
What exactly is the problem?
Combining the individual work of an enormous team of engineers into one codebase is tough. And the difficulty lies deeper than our standard merge conflicts, which version control is fairly good at handling. There's a far more troublesome type of conflict we can run into when simultaneous changes are made to multiple areas of code that depend upon one another.
Let's understand the problem a bit more using GitHub. Say we create a pull request that passes CI against the main branch.
Before our PR gets merged, another commit lands on main. It does not matter how it got there, whether through another merged PR or a direct commit to the main branch.
Even after that commit, the CI checks on our PR still show green, because they ran against the older main. As inexperienced maintainers, we may consider this fine since the checks pass. So we merge the PR, and guess what? There is a good chance we just broke production.
Ordinary CI checks and the version control system both fall short here, and this is where the need for a merge queue arises.
But how can this even occur?
You might be wondering how merging a PR can break production when all the checks are passing. Imagine we add a function call in some new code while a teammate is busy refactoring that function, changing its signature or return values. Since the file we changed was not touched by our teammate's commit, there is no textual conflict, and the CI results on our PR, which ran before that commit, stay green. Version control cannot detect this kind of conflict, and CI results from a stale base cannot either, but a merge queue can.
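To make this concrete, here is a minimal Python sketch of such a conflict. The function names (price_v1, price_v2, checkout, ci_passes) and the amounts are hypothetical and exist only for illustration; ci_passes is a stand-in for a real CI run, not part of any tool.

```python
# Hypothetical example: all names and numbers are illustrative only.

def price_v1(amount_cents):
    """Helper as it exists on main when we open our PR: returns the total in cents."""
    return amount_cents * 120 // 100


def price_v2(amount_cents, currency):
    """Teammate's refactor: a new required parameter and a tuple return value."""
    return (amount_cents * 120 // 100, currency)


def checkout(price_fn):
    """Our new code, written and tested against the original signature."""
    return price_fn(1000)


def ci_passes(price_fn):
    """Stand-in for a CI run: True if checkout works with the given helper."""
    try:
        return checkout(price_fn) == 1200
    except TypeError:
        return False


# Our branch was tested against the old main, so CI is green.
print(ci_passes(price_v1))  # True
# Once the refactor is on main, the same code fails, with no textual conflict.
print(ci_passes(price_v2))  # False
```

Both changes are fine on their own. The breakage only shows up when our code is tested together with the refactored helper, which is exactly the test a stale green check never ran.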
Understanding the scale of the problem
At Shopify, developers merge and deploy over 400 pull requests each day to Shopify master. More than a million merchants depend on Shopify, so a single bad merge could cost billions of dollars. The engineering team at Uber reported that even with a 5% chance of a real conflict between two changes, the chance of a conflict grows to 40% with only 16 concurrent and potentially conflicting changes. For large organizations, these conflicts can happen almost every day.
What is a Merge Queue?
A merge queue is exactly what it sounds like: a queue to which we add our branch or pull request so that it can be merged. It is a FIFO queue that manages the merge workflow for our GitHub repository.
It is always easier for developers to commit directly to the master branch. But as a project scales and its code complexity and number of contributors grow, the need for branches arises. In simple terms, a merge queue can be thought of as a tool that combines the convenience of committing directly to master with the scalability of working in branches. Aside from the obvious convenience of not having to wait for builds to pass before merging, the big reason for using a merge queue is that it helps eliminate bad merges, saving us both time and money.
A merge queue solves the stale-check problem by updating any pull request that is not up to date with its base branch before it is merged. The update forces the continuous integration system to retest the pull request with the new code from its base branch, catching any potential regression.
Workflow of a Merge Queue
Merge queues operate on a rather simple premise. Suppose we have two pull requests for two different features. Neither Git nor CI was wrong in saying that each of the two branches was, at one point in time, passing all the checks. But those checks only mean that the PR could be merged into the main branch at that particular moment, and the answer might change later. Therefore, any time the main branch changes, we must reevaluate the compatibility and functionality of the PR.
Merge queues make sure every PR is inserted into the queue and merged in the right order. They ensure that all branches and PRs are merged using a specific process. That process is:
- A pull request is approved by one or more reviewers
- The pull request is added to the merge queue
- When the pull request reaches the top, the branch/PR is updated with the latest main branch and all CI checks are run again
- The pull request is removed from the queue and is either merged or closed.
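The steps above can be sketched as a tiny Python simulation. This is only a toy model under stated assumptions: main is represented as a set of function names, each PR is a hypothetical dictionary of the functions it adds, removes, and calls, and run_ci is a stand-in for a real CI run. None of these names come from GitHub or Mergify.

```python
from collections import deque


def apply_pr(main_symbols, pr):
    """Return the set of functions main would contain after merging the PR."""
    return (main_symbols | pr["adds"]) - pr["removes"]


def run_ci(main_symbols, pr):
    """Stand-in for CI: pass if every function the PR calls exists after the merge."""
    return pr["calls"] <= apply_pr(main_symbols, pr)


def process_queue(main_symbols, approved_prs):
    """Merge approved PRs in FIFO order, re-testing each one against the latest main."""
    queue = deque(approved_prs)
    merged, rejected = [], []
    while queue:
        pr = queue.popleft()               # the PR reaches the top of the queue
        if run_ci(main_symbols, pr):       # checks re-run on the updated branch
            main_symbols = apply_pr(main_symbols, pr)
            merged.append(pr["name"])
        else:
            rejected.append(pr["name"])    # removed from the queue without merging
    return main_symbols, merged, rejected


main = {"get_price"}
refactor = {"name": "refactor-pricing", "adds": {"get_price_v2"},
            "removes": {"get_price"}, "calls": {"get_price_v2"}}
checkout = {"name": "add-checkout", "adds": {"checkout"},
            "removes": set(), "calls": {"get_price"}}

# Each PR passes on its own against the original main.
print(run_ci(main, refactor), run_ci(main, checkout))  # True True

final_main, merged, rejected = process_queue(main, [refactor, checkout])
print(merged)    # ['refactor-pricing']
print(rejected)  # ['add-checkout']
```

Because the second PR is re-tested only after the first one has landed, the broken combination is caught in the queue instead of on main.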
How are organisations solving the issue?
GitHub offers a native merge queue solution, but it is currently only available in a limited public beta. This is potentially the best solution for GitHub users as it is implemented under the GitHub Actions umbrella. For now, we need to be one of the lucky ones selected for the beta to experiment with that feature.
If you are a GitLab Premium user, you need not worry about the merge queue, because in GitLab it is already publicly available under the name merge trains.
Mergify provides a merge queue feature that is easy to set up and configure to our needs. It is free forever for open-source projects and comes with a 14-day free trial for private projects.
Mergify not only helps solve problems with merge queues but also has many useful features that GitHub does not offer natively.
A technical deep dive
To understand the technical nuances in depth, let's get our hands dirty and try out some of the features of Mergify.
In a big organization, resolving merge conflicts is not the only problem we face. We must assign each PR for review to the right person, make sure it is reviewed before merging, set labels, rebase branches, backport PRs to different branches, and the list goes on. You can use Mergify to handle all of these. Mergify provides too many features to fit into one blog post, so let's try out some of them. In this hands-on we will be:
- adding checks before merging
- merging automatically
- using commands to instruct the Mergify bot
- creating and using a merge queue
Let's start by creating a GitHub repository. I am naming it Mergify, but you are free to name it whatever you want. Open the following URL to initialise the project.
https://github.com/new
Fill in the details and then click on the Create repository button.
Now that our repository is initialised, let us get an account on Mergify. We can sign up directly with our GitHub account. If you are new to the tool, clicking the Signup button redirects you to the auth page.
During authentication we get the option to install the tool on all repositories or only on selected ones. Once we are authenticated, we get access to the Dashboard.
As we can see in the dashboard, the Merge Queues button lets us inspect the pull requests in the form of a queue. The Editor Config option gives us a starter template for our configuration, which we can customise according to our needs. The Usage tab shows which users are linked to the account. The sidebar also has options such as Application keys, where you can store your private and public keys.
Let us click on the button Add more repositories to add the repository we created.
Now, we can work on the Mergify repository and implement features according to our needs.
For the purpose of this hands-on, let's build a basic React project to showcase the features. Once you have cloned the repository, change the working directory to the Mergify directory and run the following command to create a new React application.
npx create-react-app .
Then spin up the application on localhost:
npm start
If the default Create React App page loads in your browser, you are good to go.
Now that everything is set up, let's start with Mergify.
The first step is to create a Mergify configuration for our repository. A simple configuration file could look something like this:
pull_request_rules:
  - name: Automatic merge on approval
    conditions:
      - or:
          - "#files=1"
          - files=README.md
    actions:
      merge:
        method: merge
Let's understand this. We added a pull request rule which says Mergify should automatically merge every pull request that either changes exactly one file or changes README.md. This helps small changes get merged quickly into the main branch.
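As a rough mental model, the condition can be read like the Python function below. This is only a sketch of how I read the two conditions, not Mergify's actual implementation; rule_matches is a hypothetical name, and the file paths are made up for the example.

```python
def rule_matches(changed_files):
    """Mirror the rule's 'or' block for a PR's list of changed file paths."""
    exactly_one_file = len(changed_files) == 1      # "#files=1"
    touches_readme = "README.md" in changed_files   # "files=README.md"
    return exactly_one_file or touches_readme


print(rule_matches(["src/App.js"]))                    # True: a single file changed
print(rule_matches(["README.md", "src/App.js"]))       # True: README.md is among the changes
print(rule_matches(["src/App.js", "src/index.js"]))    # False: two files, no README.md
```

Note that the rule never looks at approvals or CI status, so it suits small, low-risk changes only.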
We can use "and" and "or" conditions to make our rules more specific. Now, to check if our configuration is correct, click on the blue button saying "check my configuration". We can also add a PR to check it against, which will give us more details on the rules created.
We created a branch called header and sent a PR to the main branch which adds a header text to the website.
Let's try checking our Mergify configuration against this PR.
The panel on the right clearly shows that the rule matches, since the PR only changed one file. After testing the rule, we can easily add this configuration to our repository by clicking the corresponding button.
If you check your GitHub repository, you will find a new PR from Mergify. Just merge the PR and your rule is added.
If you look carefully, once you merge that PR, the header PR also gets merged automatically. If you open the closed header PR, you will notice that Mergify has merged it for you.
This was a very simple configuration, but with Mergify you can add many more complex rules. The sky is the limit!
You can also instruct the Mergifyio bot to do certain tasks for you, like updating the configuration file. A complete list of commands is available in the Mergify documentation.
Lastly, let's look into the merge queue.
A simple use is to merge pull requests serially, ensuring that each one passes CI in turn:
queue_rules:
  - name: default
    conditions: []
pull_request_rules:
  - name: merge using the merge queue
    conditions:
      - "#approved-reviews-by>=2"
      - check-success=Travis CI - Pull Request
    actions:
      queue:
        name: default
In the above configuration, we define a queue named default which allows any pull request to be merged once it enters it. To enter the queue, the PR must have two approving reviews and pass the CI. Once a pull request reaches the first position in the queue, it is updated with the newest commit of its base branch.
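The entry conditions can be pictured with a short Python sketch. Again, this is only an illustration of the two conditions in the rule, not how Mergify evaluates them internally; can_enter_queue and the shape of check_results (a dictionary mapping check names to a status string) are assumptions made for the example.

```python
REQUIRED_APPROVALS = 2
REQUIRED_CHECK = "Travis CI - Pull Request"


def can_enter_queue(approvals, check_results):
    """Return True if a PR satisfies both conditions of the queueing rule."""
    enough_reviews = approvals >= REQUIRED_APPROVALS            # "#approved-reviews-by>=2"
    ci_green = check_results.get(REQUIRED_CHECK) == "success"   # "check-success=..."
    return enough_reviews and ci_green


print(can_enter_queue(2, {"Travis CI - Pull Request": "success"}))  # True
print(can_enter_queue(1, {"Travis CI - Pull Request": "success"}))  # False: one approval short
print(can_enter_queue(3, {"Travis CI - Pull Request": "failure"}))  # False: CI is red
```

Passing these conditions only gets the PR into the queue. The re-test against the latest base branch still happens once it reaches the front.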
Similarly, we can have multiple queues, for example urgent and default, although this feature is currently available only to premium users. You can add labels to PRs to route them to different queues.
Outro
The features of Mergify are endless. The purpose of this blog was to spark an interest in this amazing tool, which can save you a lot of time and effort. Check out all the features in the documentation. Don't wait too long: create your account and get started with Mergify. In case you have questions about the article or want to discuss anything under the sun, feel free to connect with me on LinkedIn.
If you run an organisation and want me to write for you, please do connect with me.