Revision as of 14:39, 10 November 2014 by Edward (Talk | contribs) (just making the article current)

Description: Version control tool
SHARCNET Package information: see GIT software page in web portal
Full list of SHARCNET supported software


Git was created by Linus Torvalds to manage the Linux kernel code after BitKeeper withdrew permission for the kernel developers to use their proprietary system free of charge.

Linus specifically designed git to not be like CVS (which he hated from his time at Transmeta), to support a distributed workflow, to provide very strong safeguards against accidental or malicious corruption, and to have very high performance. These latter features have resulted in it becoming very popular in the open source world.

Further general information can be found on the git Wikipedia page. The "Why Git is Better than X" website gives an overview of the various advantages of git relative to other revision control systems.

Basic workflow

At any one time, while working with a git repository, there are three stages work can be saved under (see the gitglossary man page for further definitions):

  • working tree - the files that are in the current directory,
  • index - a stored snapshot of the working tree, and
  • branch - active lines of development.

It is useful to understand this in terms of what is happening behind the scenes. A git repository is actually just a collection of objects, that is, chunks of data, identified by their SHA-1 hashes (often abbreviated to a unique leading part). The types of objects are

  • blob - a non-specific object (e.g., the contents of a file),
  • tree - a collection of blobs and tree references (i.e., a directory),
  • commit - information about a specific revision, including a commit message, references to parent commits, and an associated tree reference, and
  • tag - a reference object with a tag message.

The various branches are then just references to commit objects, which chain together in a directed acyclic graph to produce a history. The index is a tree object. Committing the index creates a new commit object that records the commit message, the current index tree, and current commit object as the parent. The current branch reference is then updated to reference the new commit object.
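The object model can be inspected directly with git cat-file. The following is a minimal sketch, assuming a throwaway repository in a temporary directory and a placeholder identity (neither comes from this page):

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "hello" > hello.txt
git add hello.txt
git commit -q -m "First commit"

commit_type=$(git cat-file -t HEAD)            # the commit object itself
tree_type=$(git cat-file -t 'HEAD^{tree}')     # the tree that the commit references
blob_type=$(git cat-file -t HEAD:hello.txt)    # the contents of the file
echo "$commit_type $tree_type $blob_type"      # prints "commit tree blob"

# The current branch is only a reference to the commit object
git rev-parse HEAD
git rev-parse "$(git symbolic-ref HEAD)"
```

The last two commands print the same hash, since the branch reference simply points at the newest commit.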


Working locally with git consists of repeatedly working with the files in the working tree, adding them to the index at some point, and then committing the index to the active branch at some further point. It is probably easiest to think of the index as the staging area. Changes/updates are accumulated in it (git add <filename>) until some reasonable state of development is reached, and then they are committed to the current development branch (git commit).
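The movement of a file from the working tree to the index and then into a commit can be followed with git status. A minimal sketch, assuming a scratch repository and a hypothetical file named notes.txt:

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "draft" > notes.txt
before_add=$(git status --porcelain)     # "?? notes.txt": only in the working tree
echo "$before_add"

git add notes.txt
staged=$(git status --porcelain)         # "A  notes.txt": now staged in the index
echo "$staged"

git commit -q -m "Add notes"
committed=$(git status --porcelain)      # empty: nothing left to stage or commit
echo "$committed"
```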

At any point in time, it is possible to fork off a new branch (git branch <branch> to create it, or git checkout -b <branch> to create it and switch to it) or switch to an entirely different development branch (git checkout <branch>), the latter of which will update the index and the working tree appropriately. Much of the power of git comes from being able to work with several active lines of development at once.
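As an illustration of creating and switching branches, here is a minimal sketch. The branch names experiment and feature are made up for the example, and the starting branch name is read from git rather than assumed, since it differs between git versions:

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
start=$(git rev-parse --abbrev-ref HEAD) # name of the initial branch

git branch experiment                    # create a branch, stay where we are
git checkout -q -b feature               # create a branch and switch to it
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q "$start"                 # switch back: feature.txt leaves the working tree
ls
git branch
```

After the final checkout, ls lists only base.txt, because the working tree has been updated to match the starting branch.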

Different branches can be brought back together in various ways. It is possible to take the changes from a single commit from anywhere and apply it to the current branch (git cherry-pick <commit>). It is also possible to apply the changes made in another branch to the current branch. This can be done by applying them on top of the current work (git merge <commit>) or underneath it (git rebase <commit>).
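The three ways of combining work can be compared side by side. The following is a sketch only, assuming a scratch repository with made-up branch names (topic, picked, merged) and file names:

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
start=$(git rev-parse --abbrev-ref HEAD)

# Let two lines of development diverge
git checkout -q -b topic
echo "topic" > topic.txt
git add topic.txt
git commit -q -m "Topic change"
git checkout -q "$start"
echo "main" > main.txt
git add main.txt
git commit -q -m "Mainline change"

# cherry-pick: copy only the tip commit of topic onto a new branch
git checkout -q -b picked "$start"
git cherry-pick topic > /dev/null

# merge: join the two histories with a commit that has two parents
git checkout -q -b merged "$start"
git merge -q --no-edit topic

# rebase: replay the topic commit on top of the mainline work
git checkout -q topic
git rebase -q "$start"

git log --graph --oneline --all
```

In the resulting graph, merged ends in a two-parent commit, while the rebased topic branch is a straight line with the mainline change underneath the topic change.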

Detailed information about all the changes made is available via the logs (git log <commit>).


Git is different from revision control software such as SVN in that it does not revolve around a central server. Each git repository is entirely self-contained and technically equivalent to all other git repositories. Distributed work relies on a sharing model of swapping changes (commits) back and forth.

While this can be done entirely by email (git format-patch <revision range> to export and git am <message> to import), this is usually only the case for people submitting patches to large open source projects. In a fully trusted situation, it is usually easier to directly import changes from (git fetch <remote>, or git pull <remote> to fetch and merge) and export changes to (git push <remote>) other repositories.
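The patch-based exchange can be tried locally without any mail involved. A minimal sketch, assuming two scratch repositories named sender and receiver and a patches directory standing in for the mailbox (all hypothetical):

```shell
work=$(mktemp -d)                        # hypothetical scratch directory
cd "$work"
# Placeholder identity used by both demo repositories
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q sender
cd sender
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
cd ..
git clone -q sender receiver             # receiver starts from the same history

cd sender
echo "fix" >> file.txt
git commit -q -a -m "Fix file"
git format-patch -q -o ../patches -1 HEAD   # export the newest commit as a patch file

cd ../receiver
git am -q ../patches/*.patch             # import it, keeping author and message
git log --oneline -1
```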

Although all git repositories are technically equal, it is common for one to be chosen as a stable reference repository, with developers choosing to only export complete and tested changes to it from their personal repositories.

Local work

The following provides some bare basic commands to get you going. It is strongly recommended to look these commands up in their man pages (man git-<command> or git <command> --help) as there is a host of useful options and alternative ways they can be run.

Turning on color for various outputs is also recommended. This can be done on a global level (i.e., store the option in ~/.gitconfig instead of the project specific .git/config) by the commands

git config --global color.diff auto
git config --global color.status auto
git config --global color.branch auto
git config --global color.interactive auto

Creating a new project

To create a new git project, simply do

mkdir <directory>
cd <directory>
git init

The --shared option can also be specified to set up the repository for sharing with other SHARCNET users. See further down for more details.

Viewing changes

For summary information about what has changed (that is, the differences that exist between the current branch, the index, and the working tree), run

git status

For more detailed information about what has changed in a particular file, run

git diff <filename>

The history of changes can be viewed by

git log

A useful alias for a fancy version of this last command (here named lg, which is an arbitrary choice) is

git config --global alias.lg 'log --graph --decorate --pretty=oneline --abbrev-commit'
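Such an alias can be tried out without touching ~/.gitconfig by storing it in a scratch repository's own configuration. A minimal sketch, assuming the alias name lg (any name works) and a throwaway repository:

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "two" >> file.txt
git commit -q -a -m "Second commit"

# Without --global the alias is stored in this repository's .git/config only
git config alias.lg 'log --graph --decorate --pretty=oneline --abbrev-commit'
git lg
```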

Saving changes

First add the changes to the index via

git add <filename>

Then commit the changes by running

git commit

Files can be deleted and moved by git rm and git mv.
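Both commands change the working tree and stage the change in the index in one step. A minimal sketch, assuming a scratch repository with hypothetical file names:

```shell
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity for the demo
git config user.email "user@example.com"

echo "keep me" > old.txt
echo "delete me" > junk.txt
git add old.txt junk.txt
git commit -q -m "Initial files"

git mv old.txt new.txt                   # rename on disk and in the index
git rm -q junk.txt                       # delete on disk and in the index
git status --porcelain                   # lists the staged rename and deletion

git commit -q -m "Rename and delete"
git ls-files                             # only new.txt is tracked now
```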

Remote work

The following provides some bare basic commands for working with remote repositories. As with local work, it is strongly recommended to look these commands up in their man pages (man git-<command> or git <command> --help).

The list of remote repositories git knows about can be obtained via

git remote -v

Duplicating an existing project

To make a personal copy of an existing repository do

git clone <url>
cd <path>
git config remote.origin.push refs/heads/master:refs/heads/master

This adds a default remote origin for pulling future changes from. The last two commands set the system up so git push origin will default to sending any changes made to the local master branch to the remote master branch.
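The same steps can be rehearsed against a local bare repository instead of a real <url>. This is a sketch under that assumption; the directory names seed, upstream.git and mycopy are made up, and the branch name is read from git because it may be master or something else depending on the git version:

```shell
work=$(mktemp -d)                        # hypothetical scratch directory
cd "$work"
# Placeholder identity used by the demo repositories
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q seed
cd seed
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
cd ..
git clone -q --bare seed upstream.git    # stands in for the existing repository

git clone -q upstream.git mycopy         # personal copy with a remote named origin
cd mycopy
branch=$(git rev-parse --abbrev-ref HEAD)
git config remote.origin.push "refs/heads/$branch:refs/heads/$branch"
git remote -v

echo "change" > change.txt
git add change.txt
git commit -q -m "Local change"
git push -q origin                       # uses the push refspec configured above
```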

Adding remotes to an existing project

Remotes can be added to an existing repository by running

git remote add <remote> <url>
git config remote.<remote>.push refs/heads/master:refs/heads/master
git config branch.master.remote <remote>
git config branch.master.merge refs/heads/master

The second line sets the system up so git push <remote> will default to sending any changes made to the local master branch to the remote master branch. The third line sets the default remote to the created remote. The fourth line sets master as the default branch that git pull will merge into the local master branch.
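The effect of these four settings can be checked on a scratch setup. A minimal sketch, assuming an empty bare repository hub.git standing in for <url>, a remote named hub, and a branch name read from git rather than hard-coded as master:

```shell
work=$(mktemp -d)                        # hypothetical scratch directory
cd "$work"
# Placeholder identity used by the demo repository
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare hub.git               # empty repository standing in for <url>
git init -q project
cd project
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
branch=$(git rev-parse --abbrev-ref HEAD)

git remote add hub "$work/hub.git"
git config remote.hub.push "refs/heads/$branch:refs/heads/$branch"
git config "branch.$branch.remote" hub
git config "branch.$branch.merge" "refs/heads/$branch"

git push -q hub                          # push refspec comes from remote.hub.push
git pull -q                              # remote and branch come from branch.<name>.*
git rev-parse --abbrev-ref '@{upstream}' # shows which remote branch is tracked
```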

Pushing and pulling

Assuming the default refspec has been set up, commits can be sent to the remote by running

git push <remote>

Similarly, commits can be brought down from the remote by running

git pull <remote>

Before attempting to send local changes with the former command, it is usually a good idea to run the latter command with the --rebase option. This incorporates any changes already present in the remote repository underneath the local changes (resolving any issues that arise).

It is also possible to just bring the remote changes into the remote tracking branch (run git branch -a -v to see these) without merging for inspection via

git fetch <remote>
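A typical round trip with two developers can be simulated locally. The following is a sketch only; the repositories hub.git, alice and bob are hypothetical, and the branch is named explicitly on each push and pull so the example does not depend on configured defaults:

```shell
work=$(mktemp -d)                        # hypothetical scratch directory
cd "$work"
# Placeholder identity used by all demo repositories
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q seed
cd seed
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
cd ..
git clone -q --bare seed hub.git         # shared reference repository
git clone -q hub.git alice
git clone -q hub.git bob

cd alice
branch=$(git rev-parse --abbrev-ref HEAD)
echo "a" > alice.txt
git add alice.txt
git commit -q -m "Alice change"
git push -q origin "$branch"

cd ../bob
echo "b" > bob.txt
git add bob.txt
git commit -q -m "Bob change"
git fetch -q origin                      # updates origin/<branch> only, no merge
git log --oneline "HEAD..origin/$branch" # inspect what the remote has that we lack
git pull -q --rebase origin "$branch"    # put Alice's commit underneath Bob's
git push -q origin "$branch"
```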

All of these commands use refspecs, which correspond to paths under .git (for example, refs/heads/master), to specify the movement between local and remote branches. At some point, it is recommended to read the information about this in the git-push, git-fetch, or git-pull man pages in order to understand the + option and fast forwarding.
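The role of the + prefix can be seen by pushing a rewritten commit. This is a minimal sketch on scratch repositories (hub.git and dev are made-up names); on a shared repository, forcing an update this way can discard other people's commits, so it is shown here only to illustrate the refspec syntax:

```shell
work=$(mktemp -d)                        # hypothetical scratch directory
cd "$work"
# Placeholder identity used by the demo repository
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare hub.git
git init -q dev
cd dev
git remote add origin "$work/hub.git"
echo "v1" > file.txt
git add file.txt
git commit -q -m "Version 1"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch:$branch"     # <source>:<destination> refspec

git commit -q --amend -m "Version 1 (reworded)"   # rewrites the pushed commit

# The remote commit is no longer an ancestor, so this is not a fast-forward
if git push -q origin "$branch:$branch" 2>/dev/null; then
    plain="accepted"
else
    plain="rejected"
fi
git push -q origin "+$branch:$branch"    # leading + permits the non-fast-forward update
echo "plain push: $plain"                # prints "plain push: rejected"
```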

Sharing Repositories

All users with a common sponsor belong to the same sponsor group. This gives them read-only access to each other's directories by default, which makes it possible for each of them to pull changes from the others. For example,

git remote add <remote> /home/<other user>/<git dir>
git fetch <remote>

will add the other user's git repository as a remote of the local one and fetch it. For access from outside SHARCNET's clusters, just make the remote an SSH path

 git remote add <remote> <user>@<cluster>:/home/<other user>/<git dir>

where user is your SHARCNET username and cluster is any of the SHARCNET clusters.

Cluster Repositories

The --shared=<umask> option can be used with git init to override the default permissions, where umask is either an octal 0xxx permission mode (which overrides each user's own umask) or one of the keywords group (group writable) or all (world readable).

This can be used to create a master repository that all members of the group can write to. For example, the supervisor can do

 git init --bare --shared=group

in some master directory under their home directory (so it is backed up), where the --bare option just specifies that a working tree is not required. An existing repository can then be pushed into it via

 git remote add master /home/<supervisor>/<master dir>
 git push master

New members to the group can then get their own copy to work with by just doing

 git clone /home/<supervisor>/<master dir>

For access from outside SHARCNET clusters, /home/<supervisor>/<master dir> would have to be replaced with <user>@<cluster>:/home/<supervisor>/<master dir> as before.
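The whole sequence can be rehearsed in a temporary directory before setting it up under a real home directory. A minimal sketch, in which master.git, project and member are hypothetical names and the scratch directory stands in for /home/<supervisor>:

```shell
work=$(mktemp -d)                        # stands in for /home/<supervisor>
cd "$work"
# Placeholder identity used by the demo repositories
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Supervisor: group-writable repository without a working tree
git init -q --bare --shared=group master.git
git --git-dir=master.git config core.sharedRepository   # recorded by --shared

# Existing repository pushed into the shared one
git init -q project
cd project
echo "readme" > README
git add README
git commit -q -m "Initial import"
branch=$(git rev-parse --abbrev-ref HEAD)
git remote add master "$work/master.git"
git push -q master "$branch"
cd ..

# New group member gets a personal copy
git clone -q master.git member
ls member
```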

Sponsor group equivalent access can also be given to other specific groups and/or users by using FACLs.

 setfacl -R -n -m default:user:<user>:rwx -m user:<user>:rwx  /home/<supervisor>/<master dir>
 setfacl -R -n -m default:group:<group>:rwx -m group:<group>:rwx /home/<supervisor>/<master dir>

The above does not grant full permissions. Because specified user and group permissions are always masked by the mask permissions (what non-facl aware programs call the group permissions), it just means that the named user or group gets up to equivalent access to the sponsored group (i.e., what was specified with --shared=<umask>).

Permissions for the sponsored group can also be revoked, so that just the named groups and users have access, by running

 setfacl -R -n -m default:group::- -m group::- /home/<supervisor>/<master dir>

Web Repositories

The very popular GitHub and Gitorious sites both provide very comprehensive web front ends (including wikis, etc.) for free for public repositories. Private repositories require paying a small fee.

SHARCNET also maintains various public repositories on our website. Any of these can be checked out with

 git clone https://www.sharcnet/git/<project>

where project is the project name given on the website.

Getting Help For Git

Git comes with extensive documentation. Run git <command> --help or man git-<command>. A simple Google search will turn up a wealth of online tutorials in addition to the reference links below.


  • GitHub Help: Using Git (cheat sheets, tutorials)
  • Git Online Documentation
  • Git Cheat Sheet - Extended Edition
  • Download GUIs
  • Central Repo Howto
  • Website Howto