Git: a brief introduction
Randal L. Schwartz, [email protected]
Version 4.0.6 on 5 Jan 2012
This document is copyright 2011, 2012 by Randal L. Schwartz, Stonehenge Consulting Services, Inc. This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
http://creativecommons.org/licenses/by-nc-sa/3.0/
1Monday, February 6, 12
About me
• Been tracking git since it was created
• Used git on small projects
• Used other systems on small and large projects
• Read a lot of discussion about git on the mailing list
• Provided some patches to git, and suggestions for user interface changes
• Worked on small and medium teams with git
  • But not large ones
What is git?
• Git manages changes to a tree of files over time
• Git is optimized for:
  • Distributed development
  • Large file counts
  • Complex merges
  • Making trial branches
  • Being very fast
  • Being robust
But not for...
• Tracking file permissions and ownership
• Tracking individual files with separate history
• Making things painful
Why git?
• Essential to Linux kernel development
• Created as a replacement when BitKeeper suddenly became “unavailable”
• Now used by thousands of projects
• Everybody has a “commit bit”
Everyone can...
• Clone the tree
• Make and test local changes
• Submit the changes as patches via mail
• OR submit them as a published repository
• Track the upstream to revise if needed
How does git do it?
• Universal public identifiers
  • None of the SVK “my @245 is your @992”
• Multi-protocol transport: HTTP, SSH, GIT
• Efficient object storage
  • Everyone has entire repo (disk is cheap)
• Easy branching and merging
  • Common ancestors are computable
• Patches (and repo updates) can be transported or mailed
  • Binary “patches” are supported
The SHA1 is King
• Every “object” has a SHA1 to uniquely identify it
• “Objects” consist of:
  • Blobs (the contents of a file)
  • Trees (directories of blobs or other trees)
  • Commits:
    • A tree
    • Plus zero or more parent commits
    • Plus a message about why
  • And tags
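The object types above can be inspected directly with "git cat-file". This is a minimal sketch, assuming a scratch repository in a temporary directory; the file name, commit message, and identity are placeholders, not part of the original slides.

```shell
# Scratch repository (path and identity are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "initial"

# Resolve the three objects behind this one commit.
commit=$(git rev-parse HEAD)            # the commit object
tree=$(git rev-parse 'HEAD^{tree}')     # the tree the commit points at
blob=$(git rev-parse HEAD:hello)        # the blob holding the file contents

# cat-file -t reports the type of any object named by its SHA1.
commit_type=$(git cat-file -t "$commit")
tree_type=$(git cat-file -t "$tree")
blob_type=$(git cat-file -t "$blob")
echo "$commit_type $tree_type $blob_type"

# cat-file -p pretty-prints the object itself.
git cat-file -p "$commit"   # tree line, author, committer, message
git cat-file -p "$tree"     # one line per entry: mode, type, SHA1, name
git cat-file -p "$blob"     # the file contents
```

The tree listing contains the blob's SHA1, and the commit contains the tree's SHA1, which is how one identifier pins down everything beneath it.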
Tags
• An object (usually a commit)
• Plus an optional subject (if anything else is given)
• Plus an optional payload to sign it off
• Plus an optional gpg signature
• Designed to be immobile
  • Changes not tracked during cloning
  • Use a branch if you want to move around
Objects live in the repo
• Git efficiently creates new objects
• Objects are generally added, not destroyed
• Unreferenced objects will be garbage collected
• Objects start “loose”, but can be “packed”
• “Packs” represent objects as deltas
• “Packs” are also created for repo transfer
Commits rule the repo
• One or more commits form the head of object chains
• Typically one head called “master”
• Others can be made at will (“branches”)
• Usually one commit in the repo that has no parent commit (“root” commit)
Reaching out
• From a commit, reaching the components:
  • Chase down the tree object to get to directories and files as they existed at this commit time
  • Chase down the parent objects to get to earlier commits and their respective trees
• Do this recursively, and you have all of history
  • And the SHA1 depends on all of that!
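A short sketch of chasing parents and trees from a commit, assuming a scratch repository with two placeholder commits. The last step uses the plumbing command "git commit-tree" to build two commits over the same tree, one with a parent and one without, as one way to see that the commit SHA1 covers the history as well as the files.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one >file && git add file && git commit -q -m "first"
echo two >file && git commit -q -a -m "second"

# Chase the parent chain: every commit reachable from HEAD.
count=$(git rev-list --count HEAD)
parent=$(git rev-parse 'HEAD^')
root=$(git rev-list --max-parents=0 HEAD)   # the commit with no parent

# Chase the tree of an earlier commit to see the file as it was then.
old_content=$(git show 'HEAD^:file')
echo "$count commits, root $root, old content: $old_content"

# Same tree, same message, different ancestry: the SHA1s differ.
tree=$(git rev-parse 'HEAD^{tree}')
with_parent=$(git commit-tree "$tree" -p "$parent" -m "same tree")
without_parent=$(git commit-tree "$tree" -m "same tree")
echo "$with_parent"
echo "$without_parent"
```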
The git repo
• A “working tree” has a “.git” dir at the top level
• Unlike CVS, SVN: no pollution of deeper directories
• This makes it friendly to recursive greps
The .git dir contains:
• config - Configuration file (.ini style)
• objects/* - The object repository
• refs/heads/* - branches (like “master”)
• refs/tags/* - tags
• logs/* - logs
• refs/remotes/* - tracking others
• index - the “index cache” (described shortly)
• HEAD - points to one of the branches (the “current branch”, where commits go)
The index (or “cache”)
• A directory of blob objects
• Represents the “next commit”
• “Add files” to put current contents in
• “Commit” takes the current index and makes it a real commit object
• Diff between HEAD and index:
  • changed things not yet committed
• Diff between index and working dir:
  • changed things not yet added
  • untracked things
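The three states described above (HEAD, index, working directory) can be told apart from the command line. This is a small sketch under assumed names: the files "hello" and "notes" and the scratch repository are illustrative only.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "initial"

echo "first edit" >>hello
git add hello                   # the index now differs from HEAD
echo "second edit" >>hello      # the working dir now differs from the index
echo "scratch" >notes           # never added: untracked

staged=$(git diff --cached --name-only)                 # HEAD vs index
unstaged=$(git diff --name-only)                        # index vs working dir
untracked=$(git ls-files --others --exclude-standard)   # unknown to the index
staged_text=$(git show :hello)                          # contents held in the index

echo "staged: $staged / unstaged: $unstaged / untracked: $untracked"
```

A plain "git commit" at this point would record the index copy of "hello", which includes the first edit but not the second.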
What’s in a name?
• Git doesn’t record explicit renaming
  • Nor expect you to declare it
• Exact renaming determined by SHA1
• Copy-paste-edits detected by similarity
• Computer better than you at that
  • Explicit tracking will be wrong sometimes
  • Being wrong breaks merges
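A quick sketch of rename detection after the fact, assuming a scratch repository and placeholder file names. The file is renamed with a plain "mv", nothing is declared to git, and the rename is still reported when the two commits are compared.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line %s\n' 1 2 3 4 5 >old-name
git add old-name
git commit -q -m "add file"

mv old-name new-name        # ordinary filesystem rename
git add -A                  # stage the removal and the new file
git commit -q -m "rename file"

# -M asks diff to pair removed and added files; identical blobs pair exactly.
rename_report=$(git diff -M --name-status 'HEAD^' HEAD)
echo "$rename_report"
```

The status letter R is followed by a similarity score, so an exact rename (same blob SHA1) is reported at 100 percent.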
Git speaks and listens
• Many protocols to transfer between repos
  • rsync, http, https, git, ssh, local files
• In the core, git also has:
  • import/export with CVS, SVN
• I use CVS/SVN import to have entire history of a project at 30K feet
• Third party solutions handle others
• Git core also includes “git cvsserver”
  • A git repository can act like a CVS repository for legacy clients or humans
Getting git
• Get the latest “git-*.tar.gz” from code.google.com/p/git-core
• RPMs and Debian packages also exist
• Track the git-developer archive:
  • git clone git://git.kernel.org/pub/scm/git/git.git
• Maintenance releases are very stable
• I install mine “prefix=/opt/git”
  • add /opt/git/bin to PATH
Git commands
• All git commands start with “git”
• “git MUMBLE-FOO bar” has also been written as “git-MUMBLE-FOO bar”
• This allows a single entry “git” to be added to the /usr/local/bin path
• This works for internal calls as well
• Manpages are still under “git-MUMBLE-FOO”
  • Unless you use “git help MUMBLE-FOO”
  • Or “git MUMBLE-FOO --help”
Porcelain and plumbing
• Low-level git operations are called “plumbing”
• Higher level actions are called “porcelain”
• The git distro includes both
• Use porcelain from command line
  • But don’t script with it
  • Future releases might change things
• Use plumbing for scripts
  • Intended to be upward compatible
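As an illustration of scripting with plumbing rather than porcelain, here is a minimal sketch that reports the current branch, the current commit, and whether tracked files have changed. The helper name "tree_state" and the scratch repository are hypothetical; the plumbing commands used are "git symbolic-ref", "git rev-parse", "git update-index", and "git diff-index".

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "initial"

# Each of these prints a single machine-readable line.
branch=$(git symbolic-ref HEAD)         # full ref name of the current branch
head=$(git rev-parse --verify HEAD)     # SHA1 of the current commit

# Hypothetical helper: do tracked files differ from HEAD?
tree_state() {
    git update-index -q --refresh || true   # refresh cached stat info first
    if git diff-index --quiet HEAD --; then
        echo clean
    else
        echo dirty
    fi
}

state_before=$(tree_state)
echo "It's a new day for git" >>hello
state_after=$(tree_state)

echo "$branch $head $state_before -> $state_after"
```

The script relies on exit codes and one-line outputs instead of parsing "git status" text meant for people.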
Creating a repo
• git init
  • Creates a .git in the current dir
• Optional: edit .gitignore
• “git add .” to add all files (except .git!)
• Then “git commit” for the initial commit
  • Creates current branch named “master”
• Could also do this on a tarball
  • tar xvfz some-tarball.tgz; cd some-tarball
  • git init
  • git add .
Cloning
• Creates a git repo from an existing repo
• Generally creates a subdirectory
  • Your workfiles and .git are in there
• Remote branches are “tracked”
• Remote “HEAD” branch checked out as your initial “master” branch as well
• Cloned repo identified as “origin”
  • But the name is otherwise not special
Committing
• Your work product is more commits
• These are always on a “branch”
• A branch is just a named commit
• When you commit, the former branch head becomes the parent
• The branch head moves to be the new commit
• Thus, you’re creating a directed acyclic graph
  • ... rooted in branch heads
• A merge is just a commit with multiple parents
Typical work flow
• Edit edit edit
• git add files/you/have changed/now
  • This adds the files to the index
  • “git add .” for adding all interesting files
• git status
  • Tells you differences between HEAD, index, and working directory
Making the commit
• “git commit”
• Popped into a text editor (or “-m msg”)
• First text line used for “short logs”
• Current branch is moved forward
• And you’re back to more editing
But which branch?
• Git encourages branching
  • A branch is just 41 text bytes!
• Typical work flow:
  • Think of something to do
  • git checkout -b topic-name master
  • work work work, commit to topic-name
• When your thing is done:
  • git checkout master
  • git merge topic-name
  • git branch -d topic-name
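The topic-branch cycle above, run end to end in a scratch repository. This is a sketch under assumptions: the file name and messages are placeholders, and the 41-byte check assumes the branch is stored as a loose ref file holding a 40-character SHA1 plus a newline, which is the default layout but not the only possible one.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo base >file
git add file
git commit -q -m "base"

# Think of something to do, and do it on a topic branch.
git checkout -q -b topic-name master
echo work >>file
git commit -q -a -m "topic work"

# The branch itself is a tiny file naming one commit.
ref_bytes=$(wc -c <.git/refs/heads/topic-name)
echo "branch file size: $((ref_bytes)) bytes"

# When the thing is done: merge it and drop the branch name.
git checkout -q master
git merge -q topic-name          # master had not moved, so this fast-forwards
git branch -q -d topic-name

last_subject=$(git log -1 --format=%s)
remaining=$(git for-each-ref --format='%(refname:short)' refs/heads/)
echo "master is now at: $last_subject (branches: $remaining)"
```

Deleting the branch removes only the name; the commit stays reachable from master.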
Working in parallel
• You can have multiple topics active:
  • git checkout -b topic1 master
  • work work; commit; work work; commit
  • git checkout -b topic2 master
  • work work work; commit
  • git checkout topic1; work work; commit
• Decide how to bring them together
  • Merge: parallel histories
  • Rebase: serial histories
  • Each has pros and cons
The merge
• git checkout master
• git merge topic1; git branch -d topic1
  • This should be a trivial (“fast forward”) merge
• git merge topic2
• Conflicts may arise:
  • overlapping changes in text edits
  • files renamed two different ways
• You need to resolve, and continue:
  • git commit -a (describe the merge fix here)
The rebase
• Rewrites commits
  • Breaks SHA1s: commits are lost!
  • Don’t rebase if you’ve published commits!
• git checkout topic2; git rebase master
  • topic2’s commits rewritten on top of master
• May result in merge conflicts:
  • git rebase --continue or --abort or --skip
• git rebase -i (interactive) is helpful
• When rebased, merge is a fast forward:
  • git checkout master; git merge topic2
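A small sketch of the rebase described above, assuming a scratch repository in which "master" and "topic2" touch different placeholder files so that no conflict arises. It records the topic's SHA1 before and after to show that the commit is rewritten, then merges as a fast forward.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo base >file && git add file && git commit -q -m "base"

git checkout -q -b topic2 master
echo topic >topic-file && git add topic-file && git commit -q -m "topic2 work"
before=$(git rev-parse topic2)

git checkout -q master
echo more >master-file && git add master-file && git commit -q -m "master work"

# Rewrite topic2's commit on top of the new master.
git checkout -q topic2
git rebase -q master
after=$(git rev-parse topic2)
echo "topic2 before: $before"
echo "topic2 after:  $after"

# master is now an ancestor of topic2, so the merge only moves the pointer.
git checkout -q master
git merge -q topic2
merged=$(git rev-parse master)
```

The old commit is no longer named by any branch, though it remains in the object store until garbage collection removes it.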
Read the history
• git log
  • Print the commit history
• git log -p
  • Print the changes, including a diff between revisions
• git log --stat
  • Summarize the changes with a diffstat
• git log -- file1 file2 dir3
  • Show changes only for listed files or subdirs
What’s the difference?
• git diff
  • Diff between index and working tree
  • These are things you should “git add”
  • “git commit -a” will also make this list empty
• git diff HEAD
  • Difference between HEAD and working tree
  • “git commit -a” will make this empty
• git diff --cached
  • Difference between HEAD and index
  • “git commit” (without -a) makes this empty
Other diffs
• git diff OTHERBRANCH
  • Other branch and working tree
• git diff BRANCH1 BRANCH2
  • Difference between two branch heads
• git diff BRANCH1...BRANCH2
  • Changes only on BRANCH2 relative to the common ancestor
• git diff --stat (other options)
  • Nice summary of changes
• git diff --dirstat (other options)
  • Summarize directory changes
Barking up the tree
• Most commands take “tree-ish” args
• SHA1 picks something absolutely
  • Can be abbreviated if not ambiguous
• HEAD, some-branch-name, some-tag-name, some-origin-name
  • Optionally followed by @{historical}
• “historical” can be:
  • yesterday, 2011-11-22, etc (date ref)
  • 1, 2, 3, etc (prior version of this ref)
  • “upstream” (upstream version of local)
Meet the parents
• Any of those on the prior slide, followed by:
  • ^n - “the n-th parent of an item” (default 1)
  • ~n - n ^1’s (so ~3 is ^1^1^1)
  • :path - pick the object from the tree
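The parent and path suffixes can be checked with "git rev-parse" and "git show". This is a sketch assuming a scratch repository with three placeholder commits, each of which overwrites the same file.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits in a row, each replacing the file contents.
for n in one two three; do
    echo "$n" >file
    git add file
    git commit -q -m "$n"
done

# ~2 is shorthand for following the first parent twice.
tilde=$(git rev-parse 'HEAD~2')
carets=$(git rev-parse 'HEAD^^')
explicit=$(git rev-parse 'HEAD^1^1')

# :path picks an object out of that commit's tree.
oldest=$(git show 'HEAD~2:file')
previous=$(git show 'HEAD^:file')
current=$(git show 'HEAD:file')
echo "$oldest, $previous, $current"
```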
Tree Examples
• git diff HEAD^ HEAD
  • most recent change on current branch
  • Also: git diff HEAD~ HEAD
• git diff HEAD~3 HEAD
  • What damage did last three edits do?
Seeing the changes
• gitk mytopic origin
  • Tk widget display of history
  • Shows changes back to common ancestor
• gitk --all
  • show everything
• gitk from..to
  • Just the changes in “to” that aren’t in “from”
• git show-branch from..to
  • Same thing for the Tk-challenged
Playing well with others
• git clone creates “tracking” branches
  • Typically named “origin/master” etc
• To share your work, first get up to date:
  • git fetch origin
• Now rebase your changes on upstream:
  • git rebase origin/master
• Or fetch/rebase in one step
  • git pull --rebase
• To push upstream:
  • git push
Resetting
• git reset --soft HEAD^
  • Moves the branch back one commit; that commit’s changes become “updated but not checked in”
• git reset --hard # DANGER
  • Forces working dir to look like last commit
• git reset --hard HEAD~3
  • Tosses most recent 3 commits
  • use “git revert” instead if you’ve published
• git checkout HEAD some/lost/file
  • Recover the version of some/lost/file from the last commit
Ignoring things
• Every directory can contain a .gitignore
  • lines starting with “!” mean “not”
  • lines without “/” are checked against basename
  • otherwise, shell glob via fnmatch(3)
  • Leading / means “the current directory”
  • Checked into the repository and tracked
• Every repository can contain a .git/info/exclude
• Both of these work together
  • But .git/info/exclude won’t be cloned
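A sketch of the ignore rules in action, assuming a scratch repository and made-up file names. It writes a small .gitignore plus one private entry in .git/info/exclude, then asks git which untracked files it would still consider.

```shell
# Scratch repository (path and all file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q

cat >.gitignore <<'EOF'
*.o
!keep.o
/build
EOF

# Private ignore rule: lives in this repository only, not cloned.
mkdir -p .git/info
echo 'scratch.txt' >>.git/info/exclude

mkdir -p build sub
touch foo.o keep.o sub/foo.o build/out.txt sub/build scratch.txt notes.txt

# List untracked files that survive both sets of ignore rules.
visible=$(git ls-files --others --exclude-standard)
echo "$visible"
```

Here "*.o" has no slash, so it applies at every depth, "!keep.o" carves out an exception, and "/build" is anchored to the directory holding the .gitignore, so "sub/build" is still visible.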
Configuration
• Many commands have configurations
• git config name value
  • set name to value
  • name can contain periods for sub-items
• git config name
  • get current value
• git config --global name [value]
  • Same, but with ~/.gitconfig
  • This applies to all git repos from a user
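A minimal sketch of setting and reading configuration values, assuming a scratch repository; the name and editor values are placeholders. It stays with per-repository settings so that nothing in ~/.gitconfig is touched.

```shell
# Scratch repository (path and values are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q

# "section.key" names: the part before the period is the section.
git config user.name "Example User"
git config core.editor "vi"

# Reading a value back.
name=$(git config user.name)
editor=$(git config core.editor)
echo "$name uses $editor"

# The values land in the .ini-style file inside .git.
cat .git/config
```

Adding --global to the same commands would read and write ~/.gitconfig instead of the repository's own file.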
The stash
• Creates temporary commits to represent:
  • current index (git add ...)
  • current working directory (git add .)
• Can rebase those onto new index later
• Many uses, such as pull into dirty workdir:
  • git stash; git pull ...; git stash pop
  • Might result in conflicts, of course
• Multiple stashes can be in play
  • “git stash list” to show them
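A short sketch of the stash round trip, assuming a scratch repository with one placeholder file. There is no remote here, so the "git pull" step from the slide is left as a comment.

```shell
# Scratch repository (path, identity, and file names are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "initial"

echo "half-finished idea" >>hello     # dirty working directory

git stash -q                          # set the changes aside
during=$(git status --porcelain)      # empty: tree matches HEAD again
stash_count=$(git stash list | wc -l)

# ... this is where "git pull" or other work would happen ...

git stash pop -q                      # bring the changes back
restored=$(tail -n 1 hello)
left_over=$(git stash list | wc -l)
echo "restored line: $restored"
```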
Other useful porcelain
• git archive: export a tree as a tar/zip
• git bisect: find the offensive commit
• git cherry-pick: selective merging
• git mv: rename a file/dir with the right index manipulations
• git rm: ditto for delete
• git push: write to an upstream
• git revert: add a commit that undoes a previous commit
• git blame: who wrote this?
Commit Advice
• Split changes into small logical steps
  • Ideally ones that pass the test suite again
• This helps for “blame” and “bisect”.
• Easier to squash commits later than to break up
  • “git rebase -i” can squash, omit, reorder
Picking from branches
• Two main tools: “merge” and “cherry-pick”
• Merge brings in all commits
  • Scales well for large workflows
• Cherry-pick brings in one or more
  • Great when a single patch is needed
git.git’s workflow
• Four branches:
  • maint: fixes to existing releases
  • master: next release
  • next: testing for next master
  • pu: experimental features
• Each one is a descendant of the one above
• Commit to the oldest branch needing patch
• Then merge it upward:
  • maint to master to next to pu
Topic branches
• Most features require several iterations
• Commit these to topic branches during design
  • Easier to rehack or abandon this way
• Fork topic from the oldest main branch
  • Refresh-merge from that branch if needed
  • But don’t do that routinely
• Rebase topic branch if forked from wrong branch
• More details at “man 7 gitworkflows”
Testing integration
• Merge from base branch to topic branch
  • ... on a new throw-away branch
• This branch is never merged back in
  • Just for testing
• Can be published publicly, if you make that clear
  • Otherwise, typically used only locally
• If integration fails, fix, and cherry-pick those back to the topic branch before final merge
Time to “git” dirty
• Make a git repository:
  • mkdir git-tutorial
  • cd git-tutorial
  • git init
  • git config user.name "Randal Schwartz"
  • git config user.email [email protected]
• Add some content:
  • echo "Hello World" >hello
  • echo "Silly example" >example
What’s up?
• git status
• git add example hello
• git status
• git diff --cached
“git add” timing
• Change the content of “hello”
  • echo "It's a new day for git" >>hello
  • git status
  • git diff
• Now commit the index (with old hello)
  • git commit -m initial
  • git status
  • git diff
  • git diff HEAD
git commit -a
• Note that we committed the version of “hello” at the time we added it!
• Fix this by adding -a nearly always:
  • git commit -a -m update
  • git status
What happened?
• Ask for logs:
  • git log
  • git log -p
  • git log --stat --summary
• Tag, you’re it:
  • git tag my-first-tag
• Now we can always get back to that version later
Sharing the work
• Create the clone:
  • cd ..
  • git clone git-tutorial my-git
  • cd my-git
• The git clone will often have some sort of transport path, like git: or rsync: or http:
• See what we’ve got:
  • git log -p
• Note that we have the entire history
  • And that the SHA1s are identical
Branching out
• Create branch “my-branch”
  • git checkout -b my-branch
  • git status
• Make some changes:
  • echo "Work, work, work" >>hello
  • git commit -a -m 'Some work.'
Conflicts
• Switch back, and make other changes:
  • git checkout master
  • echo "Play, play, play" >>hello
  • echo "Lots of fun" >>example
  • git commit -a -m 'Some fun.'
• We now have conflicting commits
Seeing the damage
• In an X11 display:
  • gitk --all
• The --all means “all heads, branches, tags”
• For the X11 challenged:
  • git show-branch --all
  • git log --pretty=oneline --abbrev-commit --graph --decorate --all
  • Handy for a mail message
Merging
• We’re on “master”, and we want to merge in the changes from my-branch
• Select the merge:
  • git merge my-branch
• This fails, because we have a conflict in “hello”
• See this with:
  • git status
• Edit “hello”, and commit:
  • git commit -a -m "Merge work in my-branch"
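The conflict and its resolution can be replayed as a script. This is a sketch that rebuilds the tutorial's two branches in a scratch repository, with a placeholder identity; the hand edit of “hello” is replaced by writing out one possible resolution that keeps both lines, which is an assumption about what the editor session would produce.

```shell
# Scratch repository (path and identity are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" >hello
git add hello
git commit -q -m "initial"

git checkout -q -b my-branch
echo "Work, work, work" >>hello
git commit -q -a -m 'Some work.'

git checkout -q master
echo "Play, play, play" >>hello
git commit -q -a -m 'Some fun.'

# Both branches appended to the same spot, so the merge stops.
if git merge my-branch >/dev/null 2>&1; then
    outcome=clean
else
    outcome=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)   # files still unmerged
echo "merge outcome: $outcome in $conflicted"

# Stand-in for editing "hello" by hand: keep both lines, drop the markers.
printf 'Hello World\nPlay, play, play\nWork, work, work\n' >hello
git commit -q -a -m "Merge work in my-branch"

# A merge commit lists two parents after its own SHA1.
words=$(git rev-list --parents -n 1 HEAD | wc -w)
echo "merge commit has $((words - 1)) parents"
```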
Did it work?
• Verify the merge with:
  • gitk --all
  • git show-branch --all
• See changes back to the common ancestor:
  • gitk master my-branch
  • git show-branch master my-branch
• Note that master is only one edit from my-branch now (the merge patch-up)
• “git show” handy with merges:
  • git show HEAD
Merging the upstream
• Master is now updated with my-branch changes
• But my-branch is now lagging
• We can merge back the other way:
  • git checkout my-branch
  • git merge master
• This will succeed as a “fast forward”
  • This means that the merge-from branch already has all of our change history
  • So it’s just adding linear history to the end
Upstream changes
• Let’s change origin a bit
  • cd ../git-tutorial
  • echo "some upstream change" >>other
  • git add other
  • git commit -a -m "upstream change"
• And now fetch it downstream
  • cd ../my-git
  • git fetch
  • gitk --all
  • git diff master..origin/master
Merge it in
• Explicit merging
  • git checkout master
  • git merge origin/master
• Implicit fetch/merge
  • git pull
• Eliminating the bushy tree
  • git pull --rebase
  • (Fails in our example.. sigh.)
Splitting up a patch
• Sometimes, your changes are logically separate
  • echo "this change" >>hello
  • echo "unrelated change" >>example
• Now make two commits:
  • git add -p # interactively select hello change
  • git commit -m "fixed hello" # not -a!
  • git commit -a -m "fixed example"
Fixing a commit
• Oops, left out something on that last one
  • echo "another unrelated" >>example
• Now “amend” the patch:
  • git commit -a --amend
• This replaces the commit
  • Be careful that you haven’t pushed it!
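A sketch showing that amending replaces the commit instead of adding one, assuming a scratch repository with placeholder contents. The "-C HEAD" option reuses the existing message so that no editor opens, which is a convenience for scripting and not part of the slide's command.

```shell
# Scratch repository (path, identity, and file contents are placeholders).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Silly example" >example
git add example
git commit -q -m "initial"

echo "unrelated change" >>example
git commit -q -a -m "fixed example"
before=$(git rev-parse HEAD)

# Oops, left something out of that last commit.
echo "another unrelated" >>example
git commit -q -a --amend -C HEAD      # reuse the old message, no editor
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commits; tip was $before, is now $after"
```

The commit count stays the same while the SHA1 of the tip changes, which is why amending something already pushed causes trouble for anyone who fetched the old commit.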
For further info
• See “Git (software)” in Wikipedia
• And the git homepage http://git-scm.com/
• Git wiki at https://git.wiki.kernel.org/
• Wonderful Pro Git book: http://progit.org/book/
• Get on the mailing list
  • Helpful people there
  • You can submit bugs, patches, ideas
• And the #git IRC channel (on Freenode)
• Now “git” to it!