I'm sure you can imagine the importance of versioning code, so that we can revert changes and recover lost data among other possibilities.
I bet you know someone (not me hehe) who does version control with their files by creating copies of them with increasingly creative names...
This was probably how anyone would do version control with their code as well before the release of SCCS (Source Code Control System) in 1972, one of the first version control systems ever released.
But we're not here to talk about SCCS. What really interests us now is GIT, the distributed open-source version control software that celebrates its 20th anniversary next year (it was first released on April 7, 2005).
Table of contents
- 1. What's GIT?
- 2. How does GIT work?
- 3. Installing GIT
- 4. Configuring GIT
- 5. Starting a local repository
- 6. Working with GIT
- 7. Knowing branches
- 8. Syncing with the remote repository
- 9. Conclusion
- 10. References
1. What's GIT?
GIT is an open-source distributed version control system launched in 2005 and developed by Linus Torvalds (yes, the Linux kernel creator).
With GIT, we can locally control the versions of a project (in the working folder) and synchronize all changes to a remote repository (on GitHub, for example).
2. How does GIT work?
Imagine a physical file cabinet where there's a folder with all the project files. Whenever someone needs to manipulate a file, they have to pick it up, removing it from the folder and returning it to the folder after finishing. So, it is impossible for two people to work on the same file, completely avoiding any possible conflicts.
BUT THAT'S NOT HOW GIT WORKS! (thank God)
This is how a lock-based CENTRALIZED version control system works, in which the user needs to "check-out" and "check-in" files, i.e. whenever someone needs to work on a specific file, they check out that file, removing it from the repository, and then check it in once the work is done, returning it to the repository.
In a DISTRIBUTED system like GIT, it's possible for several people to access files from the same remote repository. Whenever someone needs to manipulate a file, they clone the entire repository to their machine, work on it locally, and then send the modifications back to the remote repository. This makes it possible for several people to work on the same project, even manipulating the same files.
This is what allows the distribution of large open-source projects, with people from different parts of the world working on the same project, managing modifications and possible conflicts (yes, merge conflicts can happen here).
3. Installing GIT
GIT is available for the main operating systems (Windows, Linux, macOS...) with a very simple installation process, which can be done by command line or through the official installer at git-scm.com.
3.1 On Windows
To install GIT on Windows, simply go to the official website and download the installer. Then just follow the instructions, and once the installation finishes we'll be able to use the GIT commands in our terminal.
3.2 On Linux
On Debian-based Linux distributions such as Ubuntu, we can install GIT using the command below (other distributions use their own package manager):
sudo apt install git-all
By doing this, GIT should be ready to run in our terminal.
3.3 On macOS
For Mac, the easiest way to install GIT is to install Homebrew and then run the command below in the terminal:
brew install git
Then GIT should be ready to run in our terminal.
4. Configuring GIT
After installing, it's important to configure GIT with the commands below:
git config --global user.name "[username]"
# e.g. John Doe
git config --global user.email "[email@email.com]"
# e.g. johndoe@email.com
Also, it's possible to configure a specific user for a single local repository by running the same commands inside it without the --global flag.
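As a rough illustration of a repository-level setting, the sketch below sets an identity for a single throwaway repository only. The temporary folder, the name "Jane Doe" and the email address are placeholders made up for this example.

```shell
# Create a throwaway repository so nothing outside it is touched
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the values are stored only in this repository's .git/config
git config user.name "Jane Doe"
git config user.email "jane@example.com"

# Read the values back
git config user.name
git config user.email
```

Inside this repository the local values take precedence over any global ones.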
5. Starting a local repository
With GIT configured, we can start our local repository. To do this, we can start a new repository from scratch or clone an existing remote repository.
5.1 Starting from scratch (git init)
To start a new repository, simply navigate to the desired repository root folder and run the command below:
git init
By doing this, a .git directory will be created inside the project folder, which will be responsible for the version control in the working folder of this local repository.
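To see this for ourselves, here is a small sketch that initializes a repository in a temporary folder and checks that the .git directory is there. The folder name "my-project" is just an assumption for the example.

```shell
# Create an empty project folder in a temporary location
project=$(mktemp -d)/my-project
mkdir -p "$project"
cd "$project"

# Turn the folder into a GIT repository
git init -q

# The hidden .git directory now shows up in the listing
ls -a

# Prints "true" when run inside a repository's working folder
git rev-parse --is-inside-work-tree
```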
5.2 Cloning an existing repository (git clone)
Cloning an existing remote repository is as easy as starting a new one from scratch. To do this, simply run the git clone command with the URL of the remote repository, from inside the folder where we want to download it:
git clone [repository-url]
Then the entire repository will be cloned to our local machine and automatically linked to the related remote repository.
With a cloned repository, we won't need to run the git remote add command later on, since the link to the remote repository is already configured.
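The sketch below shows that automatic link. To keep it runnable offline, it assumes a local repository stands in for the remote one, so a folder path is used where the repository URL would normally go; all names are placeholders.

```shell
work=$(mktemp -d)

# Stand-in for the remote repository: a local repository with one commit
git init -q "$work/source"
cd "$work/source"
git config user.name "Jane Doe"
git config user.email "jane@example.com"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"

# Clone it into a new folder called "copy"
cd "$work"
git clone -q "$work/source" copy

# The clone already knows where it came from, under the name "origin"
cd copy
git remote -v
```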
6. Working with GIT
Within our local repository, we can create the files needed for our project, but they won't be automatically versioned by GIT. We need to tell GIT when there are changes to be versioned.
Thus, we can manipulate the files as we wish and, after finishing the desired changes, send the updated files to GIT.
To do this, it is important to understand that there is a 3-stage infinite flow (yes, infinite) in version control:
MODIFY -> STAGE -> COMMIT
MODIFY: The first stage of version control, this is where we find the files that have changed compared to the last available version.
STAGE: The second stage of version control, this is where we place modified files that we want to add to our next commit.
COMMIT: Final stage of version control, when we confirm the changes, sending the modified files that were in stage to the local repository.
After committing the modified files, we have a new version available in the local repository, which can again receive updates, going one more time to "modified", then placed in "stage" and, again, being "committed", confirming a newer version and so on (and therefore, "infinite" lol).
It's important to notice that a commit doesn't overwrite the old version of the modified files. Instead, it records the new version along with a pointer to the previous commit, thus keeping track of the versions of each file tracked by GIT.
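The flow above can be followed step by step with git status. This is a minimal sketch in a throwaway repository; the file name and the "Jane Doe" identity are assumptions for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

# MODIFY: a new file appears in the working folder
echo "<h1>Hello</h1>" > index.html
git status --short    # "?? index.html": new and untracked

# STAGE: mark the file for the next commit
git add index.html
git status --short    # "A  index.html": staged

# COMMIT: save the staged version in the local repository
git commit -q -m "Add index.html"
git status --short    # no output: nothing left to commit

# ...and the cycle starts again with the next modification
echo "<p>More content</p>" >> index.html
git status --short    # " M index.html": modified, not staged yet
```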
6.1 Adding and committing (git add and git commit)
Although it might sound complex, performing the version control flow is very simple. Once the desired modifications are completed, we add the modified files that we want to commit to stage with the command below:
git add [filename]
git add -A
-> adds all changes (new, modified and deleted files) to stage at once.
git add *.[file-extension]
-> adds all modified files with the specified file extension to stage at once (e.g. git add *.html).
We can check our current local repository status at any time using the git status command:
Note that when we run git status inside the repository after creating a new file, the new file is shown as "Untracked". This means that the file is brand new and still needs to be added to a commit in order to be tracked by GIT.
It's possible to let GIT ignore specific files or folders within the repository. To do this, we can simply add a file called .gitignore to the root folder and write the names of the files or folders that should be ignored inside it.
CAUTION: Ignored files and folders will no longer appear in GIT's tracking, not even as "Untracked". To reset the tracking, simply delete the desired names from the .gitignore file.
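As a quick sketch of how this behaves, the example below ignores every file ending in .log. The pattern and file names are assumptions chosen for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore every file with the .log extension
echo "*.log" > .gitignore

echo "debug output" > debug.log
echo "<h1>Hello</h1>" > index.html

# Lists .gitignore and index.html as untracked, but not debug.log
git status --short

# Shows which .gitignore rule is responsible for ignoring the file
git check-ignore -v debug.log
```

Note that the .gitignore file itself is a regular file, so it is usually committed along with the rest of the project.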
To include a file, we can run the git add command with the name of the file that we want to add ("index.html" in this case):
This way, by re-running git status we can see that the new file has been added to "stage" and is finally ready to be sent in our next commit, which can be done using the command below:
git commit -m "[descriptive-message]"
Commits have unique IDs (hash codes) and are IMMUTABLE, i.e. they cannot be modified once they have been confirmed.
git commit -a
-> performs a direct commit, staging all modified files that GIT already tracks and committing them (new, untracked files are not included).
After successfully committing a file, when running git status we notice that there are no more modified files to be uploaded, since all the modifications were effectively saved in our local repository with our last commit.
Also, it's possible to verify the changes made by reviewing the repository's commit log, using the git log command, which shows some metadata of all the commits made, such as the hash code, branch, author, date, etc.
This whole process can be repeated to add new files that are needed for our project, modify them and send them to the local repository by making new commits.
git log -N
-> displays a log with the last N commits.
git log [branch-A]..[branch-B]
-> displays a log of commits that are in "branch-B" but not in "branch-A".
git log --follow [filename]
-> displays a log of commits that changed the specified file, even if it has changed its name.
git diff
-> lists the changes in the working folder that haven't been staged yet, compared to the latest version known to GIT.
git diff [filename]
-> lists the unstaged changes made to the specified file in relation to its latest version known to GIT.
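Here is a small sketch of a few of these commands in action. The branch names "main" and "test", the file name and the commit messages are assumptions for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git branch -M main            # make sure the first branch is called "main"

# Add one more commit on a separate branch
git checkout -q -b test
echo "v2" >> notes.txt
git commit -q -a -m "Second commit"

git log --oneline -1          # only the most recent commit
git log --oneline main..test  # commits that are in "test" but not in "main"

# An uncommitted change shows up in git diff
echo "v3" >> notes.txt
git diff notes.txt
```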
6.2 Undoing changes before and after committing
Before a commit is made, any changes made to the local repository can be undone or changed, but once the commit is made, it cannot be changed. This is because commits are immutable objects, meaning that it is impossible to edit or change the data in a commit.
However, it is possible to make new commits that undo changes, or correct incorrect information in previous commits. Either way, we can use one of the commands below:
git checkout -- [filename]
# Discards changes made to the local file before the commit (irreversible action)
git reset --hard HEAD
# Discards all uncommitted changes to tracked files, whether staged or not (irreversible action)
git reset --hard HEAD~1
# Discards the last commit made in the local repository, along with its changes (only the last commit)
git commit --amend
# Creates a new commit, replacing the last commit made in the local repository
git revert [commit-hash]
# Creates a new commit, reverting the changes of the specified commit
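To make the "new commit that undoes changes" idea concrete, here is a minimal sketch using git revert in a throwaway repository. The file name and commit messages are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

echo "good" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

echo "bad change" >> app.txt
git commit -q -a -m "Break app.txt"

# Undo the last commit by adding a new one; the history is kept intact
git revert --no-edit HEAD

git log --oneline   # three commits: the original, the bad change and the revert
cat app.txt         # back to the single line "good"
```

Because nothing is rewritten, this approach is generally the safer choice for commits that were already shared with other people.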
7. Knowing branches
A branch is nothing more than a ramification of the repository, and so far all actions have been performed on the branch master/main.
By default, the first branch created in the repository is master/main, which is the main branch of the repository.
7.1 Why use branches?
It may not seem like much at first, but branches give enormous power to the project development.
Imagine we're developing a web platform, and we want to test a new feature, but our repository is already hosted or shared with other people, and any problematic change could cause a bad experience for them. What can we do?
If you've been thinking about copying and pasting the project folder, creating a new "test version", you're right! Well, almost...
With GIT, we can do something similar with branches. We can simply create a new branch called "test", and thus have a version of our project in a completely isolated branch, ready to be experimented with without risking the main branch.
7.2 Creating branches (git branch)
Creating a branch means creating a parallel copy of the repository that can be worked on independently, without affecting the master/main branch. To do this, we can simply run the command below:
git branch [branch-name]
Running the git branch command without a specific branch name displays the list of available branches in the repository, with a "*" marking the branch that's currently in use.
Before running the git branch test command, the git branch command only returned the master branch.
After creating a new branch, we can run the command below to switch between the available branches:
git checkout [branch-name]
After running the git checkout test command we can see that the active branch is switched. From that moment on, all committed information will be sent to the test branch of the repository, without affecting the branch master/main.
It's possible to create as many branches as we need, and we can interact with the existing branches using the commands below:
git checkout -b [branch-name]
-> creates a new branch with the given name and directly switches to it.
git branch -d [branch-name]
-> deletes the specified branch.
git branch -m [new-name]
-> changes the name of the current branch to the given name.
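The branch commands above can be tried together in a throwaway repository. In this sketch the branch names "test", "experiment" and "hotfix" are assumptions chosen for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

# A branch needs at least one commit to point to
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git branch -M main           # make sure the first branch is called "main"

git branch test              # create a branch
git checkout -q test         # switch to it
git branch -m experiment     # rename the current branch
git checkout -q -b hotfix    # create another branch and switch to it directly
git checkout -q main         # go back to the main branch
git branch -d hotfix         # delete a branch that is no longer needed

git branch                   # lists "experiment" and "* main"
```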
7.3 Merging branches (git merge)
When we've finished working on a different branch, and we're sure that the changes we've made haven't caused any problems in the project, we can merge it into the master/main branch, applying all of its changes to the main branch of the repository.
To merge branches, we need to switch to the branch that will receive the changes and run the following command:
git merge [branch-name]
# Merge the given branch into the active branch
Here, since we are on the branch test, we should switch to the branch master using the git checkout command, and then run the git merge command with the name of the branch we want to merge ("test", in this case).
By doing this, all the work done on the branch test (in this case, the creation of the style.css file) will be merged into the branch master.
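The same scenario can be reproduced from scratch with the sketch below. It assumes the main branch is called "main" and uses a throwaway repository with placeholder file contents.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git branch -M main           # make sure the first branch is called "main"

# Do some work on the "test" branch
git checkout -q -b test
echo "h1 { color: teal; }" > style.css
git add style.css
git commit -q -m "Add style.css"

# Switch to the branch that will receive the changes, then merge
git checkout -q main
ls                           # only index.html exists on main so far
git merge -q test
ls                           # style.css is now on main as well
```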
7.4 Merge conflicts
Merging different branches with git merge can lead to some conflicts in cases where one or more files have been changed on the same lines and the merge cannot be done automatically.
When this happens, we can run the git status command to check which files are in conflict.
We'll need to solve the conflicts before proceeding with the merge, either by defining which changes should take place, or by reviewing the changes so that they are mutually compatible. To do this, GIT will insert markers into the conflicting files to help with the resolution.
After solving the conflicts, we just need to put the modified files back on stage and commit the new conflict-free versions. That commit concludes the merge, so there's no need to run the git merge command again.
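A conflict is easy to provoke on purpose, which makes it a good thing to practice. The sketch below changes the same line on two branches; the file name, its contents and the choice of which version "wins" are assumptions for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Jane Doe"
git config user.email "jane@example.com"

echo "color: black" > theme.txt
git add theme.txt
git commit -q -m "Add theme"
git branch -M main

# Change the line on the "test" branch...
git checkout -q -b test
echo "color: teal" > theme.txt
git commit -q -a -m "Use teal"

# ...and change the same line differently on "main"
git checkout -q main
echo "color: navy" > theme.txt
git commit -q -a -m "Use navy"

# The merge stops and reports a conflict
git merge test || true
git status --short    # "UU theme.txt" marks the conflicted file
cat theme.txt         # both versions, separated by conflict markers

# Resolve by writing the final content, then stage and commit
echo "color: teal" > theme.txt
git add theme.txt
git commit -q -m "Merge branch 'test'"
```

In a real project we would edit the file by hand (or with a merge tool) instead of overwriting it, keeping whatever mix of both versions makes sense.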
8. Syncing with the remote repository
We already know that it's possible to connect our local repository to a remote repository and synchronize all our work remotely, keeping it up to date.
To do this we'll need to run the git push command, which sends all commits from the local repository to the remote repository, but first we need to configure a remote repository.
8.1 Configuring the remote repository
Starting a remote repository is quite simple. Here we'll use GitHub to do it.
First, we need to start a new empty repository in our GitHub account (just by choosing a name and clicking "Create repository"):
Next, we need to configure the relationship between the remote repository and the local repository by running the following command inside our local repository:
git remote add origin [remote-repository-url]
git remote -v
-> shows the URL of the remote repository that's currently connected to the local repository.
With the remote repository properly connected, we need to rename our local branch from "master" to "main" with the command git branch -m main (ignore this step if your local branch is already called main):
It's important to keep the main branch of the local repository with the same name as the main branch of the remote repository to which we are pushing.
Finally, after completing the above steps, we can sync our local repository with the remote repository for the first time using the following command:
git push -u origin main
When we run the git push -u origin main command, we may need to enter our GitHub credentials (user and access token).
If you don't know what a GitHub access token is, or you don't have one set up, see GitHub's documentation on personal access tokens.
We can also work around this by configuring authentication using the GitHub CLI, as described in its documentation.
After authenticating, git push should run successfully, synchronizing all commits in the local repository with the remote repository.
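The whole sequence can be rehearsed offline before trying it on GitHub. This sketch assumes a local bare repository plays the role of the empty remote repository, so a folder path replaces the remote URL and no authentication is involved.

```shell
work=$(mktemp -d)

# Stand-in for the empty repository created on GitHub
git init -q --bare "$work/remote.git"

# Our local repository with a first commit
git init -q "$work/local"
cd "$work/local"
git config user.name "Jane Doe"
git config user.email "jane@example.com"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git branch -M main

# Connect the local repository to the remote one and check the link
git remote add origin "$work/remote.git"
git remote -v

# First push: -u remembers origin/main as the default target
git push -q -u origin main

# The remote now holds the same commit as the local branch
git ls-remote origin main
```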
8.2 Git push after the first time (git push)
After going through all the above steps, new syncs can be done using the git push command alone, without any additional parameters.
In this case, the authentication needed to run git push was handled through the GitHub CLI, as mentioned above.
8.3 Updating the local repository (git pull)
With a distributed remote repository, it's possible for changes to be made remotely (directly in the remote repository), causing our local repository to become outdated.
With that in mind, it's very important to update the local repository regularly and bring in any changes made in the remote repository, ensuring that the local project always has the latest version available there. To do this we can run the following command:
git pull
Imagine that a new file README.md has been created directly in our remote repository and because of that our local repository is now outdated.
Within the local repository we can synchronize the changes from the remote repository using git pull as mentioned above.
The first 7 lines returned when we run the git pull command are the output of the git fetch step. In other words, if we run git pull without first running git fetch, GIT runs both steps together: it retrieves the updates from the remote repository and then merges them into the local branch.
git fetch
-> downloads updates from the remote repository, but does not apply them to the local branch (that still requires git pull or git merge).
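The difference between the two commands is easier to see side by side. This sketch assumes a local bare repository acts as the remote and two clones ("ours" and "theirs") act as two collaborators; all names are placeholders.

```shell
work=$(mktemp -d)

# Identity used for every commit made in this sketch
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

# Seed repository, then a bare copy that plays the role of the remote
git init -q "$work/seed"
cd "$work/seed"
echo "<h1>Hello</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
git branch -M main
git clone -q --bare "$work/seed" "$work/remote.git"

# Two working copies of the same remote repository
git clone -q "$work/remote.git" "$work/ours"
git clone -q "$work/remote.git" "$work/theirs"

# Someone else adds README.md and pushes it
cd "$work/theirs"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README.md"
git push -q origin main

# fetch downloads the new commit but leaves our files alone
cd "$work/ours"
git fetch -q
ls                                     # README.md is not here yet
git log --oneline main..origin/main    # the commit waiting to be merged

# pull fetches and merges, so now the file appears
git pull -q
ls
```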
9. Conclusion
All this leads us to the conclusion that GIT is an essential version control system in the daily life of a programmer, and knowing its main commands and uses can be a turning point in our technical growth. Finally, with the local and remote repositories synced and updated and with everything we've learned so far, we're ready to move forward with the practicality of this awesome version control system.