- Rahul Neelakantan
Git is a version control system that tracks the changes you make to files. It keeps a record of what you’ve done and lets you revert to a specific version should you ever need to. It also makes collaboration more manageable, allowing multiple people to work on the same files and folders without worrying about overwriting someone else’s work.
It’s an actively supported open-source project and by far the most widely used version control system. You can download it from the Git website and experiment with it. Git is supported on all major platforms, including Windows, macOS, and Linux.
Table of Contents
- Why is Git Needed?
- Can I use Git for everything?
- How does Git work?
- Alternatives to Git
- Microsoft Team Foundation Server
- Features of Git
- Cross Platform
- Non-linear development
- Open Source
- Why is Git so famous?
- Popular websites
- Fast branching and merging
- Distributed version control system
- History of Git and why it was built?
Why is Git Needed?
Let’s take a situation where you’re planning to update some code in a project. You’re worried about breaking something, so you create backup files. If you work on the project for a few days, you end up with multiple backups. It isn’t easy to maintain a good naming convention for backup files or to keep them tagged. Also, if you take full backups of your project, they can grow gigantic. So backups are tough to maintain.
Perhaps you’ve thought of working with your teammates or friends on the same project. What’s holding you back? How can they edit the same file or folder you’re working on or, worse, the same line? How do you keep these changes in sync?
Git can solve all these troubles. Each commit stores new content only for the files that changed, so the history stays small and efficient, yet restoring your project to a previous version is effortless.
Git can also help with collaboration. There is no fixed limit on how many people can work in parallel. That is one reason open-source software can compete with large organizations: thousands of developers can work on a single open-source project such as the Linux kernel.
You’ll need to understand some more advanced concepts to know how Git does all this, but that’s not the topic of this post.
Can I use Git for everything?
Git performs best when files are stored in text format, e.g., source code files, TXT files, etc. Text files are small: even a file holding thousands of lines of text usually stays in the range of kilobytes, so keeping many versions of it doesn’t take much space. Git also compresses what it stores, which reduces the size further.
If you’re working on binary files such as office documents, images, or PDFs, Git might not work as efficiently. There is an extension called Git LFS (Large File Storage), but if you’re making frequent changes to your binary files, it won’t help you much. The main reason is that binary files are often enormous, ranging from megabytes to gigabytes, so keeping every version of them can make your repository bloat rapidly.
This is one reason so much documentation is written in Markdown: a Markdown file is a plain text file with an .md extension, so it works well with Git versioning.
How does Git work?
Say you have a project stored in a specific folder. If you want Git to track changes in that folder, you have to move to that location and initialize Git. There are multiple ways to do this, for example from an editor such as VS Code. The most common method is to type git init in Git Bash, which creates a .git folder in the project directory. Git stores all of its internal files in this folder, so it’s better not to touch it unless you have a strong reason to do so.
As this is a new repository, all the files are untracked, so you’ll need to stage the files and write a commit message to create your first commit.
The .git folder stores compressed object files, which represent the contents of your working directory. The objects are named by their hash rather than by file name, so you can’t directly identify your files in there. When there are many object files, Git packs them together into a single pack file.
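To make these object files a little less mysterious, here is a minimal Python sketch of how Git names the object that holds a file’s content. The file content and the helper names (blob_bytes, git_blob_id, loose_object_path) are my own illustration, not part of Git. It only reproduces the naming and compression of an unpacked ("loose") object; it does not read or write a real repository.

```python
import hashlib
import zlib


def blob_bytes(content: bytes) -> bytes:
    """Bytes Git hashes for a file: a short header, a NUL byte, then the content."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    """SHA-1 object ID Git assigns to a file with exactly this content."""
    return hashlib.sha1(blob_bytes(content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Location of the object file relative to the .git folder."""
    # The first two hex digits become a folder name, the rest the file name.
    return f"objects/{object_id[:2]}/{object_id[2:]}"


content = b"hello world\n"  # hypothetical file content
object_id = git_blob_id(content)
print(object_id)
print(loose_object_path(object_id))

# On disk the object is zlib-compressed, so it is not readable as plain text.
stored = zlib.compress(blob_bytes(content))
print(zlib.decompress(stored) == blob_bytes(content))  # round-trips to the same bytes
```

Two files with identical content get the same ID, so Git stores that content only once. If you have Git installed, running git hash-object on a file with the same content should print the same ID.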
Now Git can track the files in your folder. If you change a file and then type “git status”, you can see that Git knows you’ve changed that file.
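The whole sequence (initialize, stage, commit, change a file, check the status) can be tried safely in a throwaway folder. The sketch below drives the regular git commands from Python so that it cleans up after itself. It assumes Git is installed and on your PATH; the git helper, the file name notes.txt, and the user name and e-mail are placeholders I made up for the example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        notes = repo / "notes.txt"  # hypothetical file name
        git(repo, "init")  # creates the .git folder
        notes.write_bytes(b"first draft\n")
        before = git(repo, "status", "--porcelain")  # file is still untracked
        git(repo, "add", "notes.txt")  # stage the file
        git(
            repo,
            "-c", "user.name=Example User",  # placeholder identity
            "-c", "user.email=user@example.com",
            "-c", "commit.gpgsign=false",
            "commit", "-m", "Add notes",
        )
        notes.write_bytes(b"second draft\n")  # change the tracked file
        after = git(repo, "status", "--porcelain")  # Git notices the change
        return before, after


if shutil.which("git") is None:
    statuses = None
    print("git is not installed, skipping the demo")
else:
    statuses = demo()
    print(statuses[0], end="")  # "??" marks an untracked file
    print(statuses[1], end="")  # " M" marks a modified tracked file
```

The --porcelain option asks git status for a short, stable output format, which is easier to read in a script than the default message.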
Alternatives to Git
There are many alternatives to Git, but nothing beats it in popularity and ease of use.
Microsoft Team Foundation Server
TFS and Git were created at almost the same time. Its original version control system, Team Foundation Version Control (TFVC), is centralized: developers hold a working copy, whereas the server holds the backups and versions. TFS is a proprietary product developed by Microsoft.
It has two workflow models:
- Server Workspaces – We always need to be connected to the server to work with files.
- Local Workspaces – We can download a copy of the latest files, work on them locally, and resolve conflicts as necessary.
TFS has since been renamed Azure DevOps Server, and Azure DevOps now uses Git as its default version control system rather than TFVC.
Many companies have used TFS, which Microsoft also positioned as a tool to support Agile project management.
Subversion (SVN) was created by CollabNet in 2000 and is now a project of the Apache Software Foundation. It is open-source software and a centralized version control system.
Mercurial has a distributed architecture similar to Git and can be extended with Python. For more information, see the Mercurial website.
Dick Grune originally developed the Concurrent Versions System (CVS) in 1986. It operates as a front end to RCS (Revision Control System), an earlier system that works on single files. CVS is free software, and Subversion was designed as its successor.
Development of CVS has largely halted; it has not been actively developed since its 2008 release.
Fossil is also a distributed system similar to Git and is being actively developed. It has built-in hosting features such as a wiki, ticketing and bug tracking, and a web forum. It is mostly useful for developers who want to self-host their projects, as SQLite does.
Features of Git
Cross Platform
Git is supported on all major platforms, i.e., Windows, macOS, Linux, etc. For example, even if a project is developed on Windows, someone on a Mac can contribute to it as long as the project’s toolchain is available there.
Though Git is cross-platform, we have to be a little careful with the naming of files and folders. Linux file systems are case-sensitive, whereas Windows and macOS are case-insensitive by default, and Windows has trouble with very long paths such as deeply nested directories. You can access this document to understand it.
Non-linear development
Unlike centralized version control systems, Git doesn’t need a constant connection to a server. It also supports rapid branching and merging. We can create a new branch for every feature and, once testing is done, merge it into the main branch. Multiple parallel branches can be created and merged at any time, which suits the non-linear nature of real-world development.
You can create branches locally or on a remote repository. Creating a branch is very cheap compared to many other SCM tools. When you create a branch, Git internally creates one additional small file that stores the commit ID the branch points to, so there is almost no change in repository size.
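To show how little a branch costs, the following Python sketch imitates the branch files in a temporary folder. It is only a simulation of the layout under .git/refs/heads, not Git itself: the commit ID and the branch name feature-login are made up, and a real repository may also keep some branches in a single packed-refs file.

```python
import tempfile
from pathlib import Path

# Hypothetical 40-character commit ID that both branches point to.
commit_id = "0123456789abcdef0123456789abcdef01234567"

with tempfile.TemporaryDirectory() as tmp:
    heads = Path(tmp) / ".git" / "refs" / "heads"
    heads.mkdir(parents=True)

    # An existing branch is a file whose content is a commit ID.
    (heads / "main").write_bytes(commit_id.encode("ascii") + b"\n")

    # "Creating a branch" adds one more file with the same kind of content.
    (heads / "feature-login").write_bytes(commit_id.encode("ascii") + b"\n")

    branches = sorted(path.name for path in heads.iterdir())
    branch_file_size = (heads / "feature-login").stat().st_size

print(branches)
print(branch_file_size, "bytes")  # 40 hex digits plus a newline
```

No project files are copied when a branch is created, which is why having many branches does not noticeably grow the repository.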
A newly initialized repository is only about 36 KB, so it is very lightweight. Git is built for high performance: operations like branching, committing, and scanning are quick, even for repositories of more than 100 MB.
Open Source
Git is open-source software. It was originally built to support Linux kernel development, one of the largest open-source projects. It is actively maintained and developed, and most large organizations use it in some way or other.
Microsoft now owns GitHub, the largest open-source code hosting platform, which is built on Git.
Git protects the integrity of its history with the SHA-1 hash algorithm, which it uses to generate commit IDs and other object IDs. Commit IDs are chained in a way similar to a blockchain: the SHA-1 of a commit is computed from its parent’s SHA-1 along with other information such as the author info and the commit message.
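Here is a small Python sketch of that chaining, under a few assumptions of mine: the commit_id helper, the author line, and the messages are made up, every commit points at the well-known ID of an empty tree, and the same identity is used as author and committer. It follows the text layout of a Git commit object but is meant as an illustration rather than a replacement for git commit.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of an empty tree
AUTHOR = "Example User <user@example.com> 1700000000 +0000"  # hypothetical


def commit_id(tree, parent, author, message):
    """Hash a commit the way Git does: header, NUL byte, then the commit text."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")  # the parent's ID is part of the input
    lines.append(f"author {author}")
    lines.append(f"committer {author}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


first_id = commit_id(EMPTY_TREE, None, AUTHOR, "First commit")
second_id = commit_id(EMPTY_TREE, first_id, AUTHOR, "Second commit")

# Rewriting the first commit changes its ID, and therefore the ID of its child.
rewritten_first_id = commit_id(EMPTY_TREE, None, AUTHOR, "First commit (edited)")
rewritten_second_id = commit_id(EMPTY_TREE, rewritten_first_id, AUTHOR, "Second commit")

print(first_id != rewritten_first_id)
print(second_id != rewritten_second_id)
```

Because each ID depends on the parent’s ID, nobody can quietly change an old commit without changing the ID of every commit that comes after it.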
Why is Git so famous?
Popular websites
GitHub is so famous that whenever you say Git, people think of GitHub. GitHub is a code hosting platform for version control and collaboration. It has over 56 million developers who contribute to open-source projects, manage their repositories, and shape the future of software together.
GitHub is a Git repository hosting service, but it adds many features of its own, such as forking and LFS.
Fast branching and merging
Branching and merging are more complex concepts, which I’ll explain in other posts. For now, take a small example where we maintain one folder for production and another for experimentation, so that we don’t introduce bugs into production. For every new experiment you’d have to repeat this process, often for several experiments in parallel. Once an experiment is successful, you’d have to sync it back manually. Now you understand why it’s an excruciating process to do by hand.
In Git, creating a branch is as simple as giving it a name. We can create thousands of branches without any noticeable performance hit. Merging branches is also largely automatic: Git combines the changes itself and asks for help only when two changes conflict. We should still review the codebase once the merge has happened.
Distributed version control system
Git allows you to work with others seamlessly, and you rarely have to merge their changes into your files manually. The main plus point of Git is that we hold the entire repository locally, so we don’t always need to be online. We can push our changes whenever we need to.
Many other version control systems, such as SVN, use a central repository, which means we have to be connected to that repository to do almost anything related to history.
History of Git and why it was built?
Git was created by Linus Torvalds in 2005 for the development of the Linux kernel, with other kernel developers contributing to its initial development.
In 2002, the Linux kernel project began using a proprietary DVCS called BitKeeper. In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down. So Linus created Git to support his main project, the Linux kernel. Since its birth in 2005, Git has evolved and matured to be easy to use while retaining its initial qualities. It’s fast, it’s very efficient with large projects, and it has an incredible branching system for non-linear development.
Now that you understand what Git is and why you should use it, please install Git and use it as your version control system instead of manually keeping track of backups.
If you want to learn more about it, you can subscribe to this blog, where I’ll be explaining more advanced concepts of Git.