GIT: Content-addressable filesystem and Version Control System

Git presentation, internals, advanced use and workflow examples. Presentation by Tommaso Visconti http://www.tommyblue.it for DrWolf srl http://www.drwolf.it

@tommyblue - www.tommyblue.it

From Nofu to Shogun

Standard presentation

• Introduction
• Basic commands
• Advanced commands
• Git internals (optional)

This presentation

• Introduction
• Git internals
• Basic commands
• Advanced commands
• Fabbri’s problem-solving

Pizzu’s

GIT

Content-addressable filesystem + tools to manage it (porcelains) = Version Control System

CVS/SVN use deltas

With long histories, computing the current state of the files becomes expensive

Git uses snapshots

• A version is a tree (like a FS) using hashes as nodes
• Every version is a snapshot of the full repository, with all files
• A file is identified by a SHA-1 hash, depending on the file content
• If the content doesn’t change, the hash is the same
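
The last two points can be checked directly. A minimal sketch, assuming a POSIX shell with git on the PATH and the default SHA-1 object format; the variable names are only illustrative.

```shell
set -eu
# Without -w, git hash-object only computes the hash (no repository needed).
hash_a=$(echo 'test content' | git hash-object --stdin)
hash_b=$(echo 'test content' | git hash-object --stdin)
hash_c=$(echo 'other content' | git hash-object --stdin)

echo "same content:      $hash_a / $hash_b"
echo "different content: $hash_c"
```

The first two hashes are identical because only the content is hashed: no file name and no timestamp is involved.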

Git internals

Plumbing

The .git folder

Everything is in the .git folder (delete it to delete the repo)

.git/
  hooks/
  info/
  objects/     => repo content
  refs/        => pointers to commit objects
  config
  description
  index        => staging area information
  HEAD         => checked-out branch

Git objects

All objects are identified by a hash

Git works like a key-value datastore

When you save an object in Git, it returns its hash

Object types: blob, tree, commit, tag

Blob objects - 1

Essentially the content of a committed file (the file name is stored in the tree, not in the blob)

# Save a simple text file (without -w it only calculates the hash)
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4

# The created object (notice the folder structure)
$ find .git/objects -type f
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

# Extract the content using the hash as reference
$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4
test content

Blob objects - 2

# New version of the file
$ echo 'version 1' > test.txt
$ git hash-object -w test.txt
83baae61804e65cc73a7201a7252750c76066a30

# Now there are two objects
$ find .git/objects -type f
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

# Restore the old version
$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4 > test.txt

Tree objects

Contains references to its children (trees or blobs), like a UNIX folder

Every commit (snapshot) has a different tree

# Add a file to the index (staging area)
$ git update-index --add --cacheinfo 100644 \
    a906cb2a4a904a152e80877d4088654daad0c859 README

# Write the tree object (containing indexed/staged files)
$ git write-tree
1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
$ git cat-file -t 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
tree

# Show the tree object content
$ git cat-file -p 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
100644 blob a906cb2a4a904a152e80877d4088654daad0c859 README
100644 blob 8f94139338f9404f26296befa88755fc2598c289 Rakefile
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0 lib

Commit objects - 1

Commit message, other information (author, committer, optional GPG signature) and a reference to a tree object (through its hash)

# First commit object given a tree object (1f7a7a)
$ echo 'first commit' | git commit-tree 1f7a7a
fdf4fc3344e67ab068f836878b6c4951e3b15f3d

# Let’s check the commit content
$ git cat-file -p fdf4fc3
tree 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
author Tommaso Visconti <[email protected]> 1243040974 -0700
committer Tommaso Visconti <[email protected]> 1243040974 -0700

first commit

TAG objects

A tag object is like a commit object, but it references a commit object (not a tree)

Lightweight tag
Just a ref pointing to a commit: no tag object is created

Annotated tag
Git creates a tag object and a ref in refs/tags pointing to its hash

$ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'test tag'

$ cat .git/refs/tags/v1.1
9585191f37f7b0fb9444f35a9bf50de191beadc2

$ git cat-file -p 9585191f37f7b0fb9444f35a9bf50de191beadc2
object 1a410efbd13591db07496601ebc7a059dd55cfe9
type commit
tag v1.1
tagger Tommaso Visconti <[email protected]> Mon Nov 25 15:56:23 2013

test tag

References - 1

Useful to avoid using and remembering hashes

Stored in .git/refs/

Text files just containing the commit hash

References - 2

Branches are references stored in .git/refs/heads

# Update master ref to the last commit
$ git update-ref refs/heads/master 1a410e

Lightweight tags are refs placed in .git/refs/tags

Remotes are refs stored in .git/refs/remotes/<REMOTE>/<BRANCH>

The commit hash stored in a remote ref is the commit that branch pointed to the last time the remote was synchronized (fetch/push)
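
The plumbing commands seen so far can be chained into a complete commit without ever calling git add or git commit. A minimal sketch, assuming a POSIX shell, git on the PATH and a throwaway temporary repository; the demo identity and the file name test.txt are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# 1. Blob: store the content, get its hash back.
blob=$(echo 'version 1' | git hash-object -w --stdin)
# 2. Index: stage that blob under the name test.txt.
git update-index --add --cacheinfo 100644 "$blob" test.txt
# 3. Tree: snapshot the index.
tree=$(git write-tree)
# 4. Commit: wrap the tree with author, date and message.
commit=$(echo 'first commit' | git commit-tree "$tree")
# 5. Reference: make the master branch point to the commit.
git update-ref refs/heads/master "$commit"

git cat-file -t "$tree"     # object type of the snapshot
git log --oneline master    # the hand-made commit is ordinary history
```

Each step only needs the hash returned by the previous one, which is the key-value behaviour described in the Git objects slide.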

Repository overview

HEAD - 1

How does Git know which branch or commit is checked out?

# HEAD is a symbolic reference
$ cat .git/HEAD
ref: refs/heads/master

# Get HEAD value using the proper tool
$ git symbolic-ref HEAD
refs/heads/master

# Set HEAD
$ git symbolic-ref HEAD refs/heads/test
$ cat .git/HEAD
ref: refs/heads/test

HEAD - 2

HEAD generally points to the checked-out branch

Committing in this situation moves the current branch to the new commit hash and, as a consequence, HEAD points to it

If HEAD points to a commit and not to a branch, it’s the so-called detached HEAD

Committing in this situation advances HEAD to the new commit hash, without modifying any branch

HEAD can be moved with git reset

Detached HEAD - 1

$ git add .
$ git commit -m message1
[master (root-commit) 3065781] message1
 1 file changed, 1 insertion(+)
 create mode 100644 test.txt

$ cat .git/HEAD
ref: refs/heads/master

$ cat .git/refs/heads/master
30657817081f6f0808bf37da470973ad12cb7593

$ echo prova > test2.txt
$ git add .
$ git ci -m test2
[master 33676f0] test2
 1 file changed, 1 insertion(+)
 create mode 100644 test2.txt

$ cat .git/HEAD
ref: refs/heads/master

$ cat .git/refs/heads/master
33676f027d9c36c66f2a2d5d74ee1cbf3e1ff56b

Detached HEAD - 2

$ cat .git/HEAD
ref: refs/heads/master

$ git checkout 30657817081f6f0808bf37da470973ad12cb7593
Note: checking out '30657817081f6f0808bf37da470973ad12cb7593'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 3065781... message1

$ cat .git/HEAD
30657817081f6f0808bf37da470973ad12cb7593

Detached HEAD - 3

$ echo 'detached' > det.txt
$ git add .
$ git commit -m 'det commit'
[detached HEAD b843940] det commit
 1 file changed, 1 insertion(+)
 create mode 100644 det.txt

$ git log
commit b8439407b2524b55ceba814a9eef2ee92655a0bb
Author: Tommaso Visconti <[email protected]>
Date: Mon Nov 25 16:49:05 2013 +0100

    det commit

commit 30657817081f6f0808bf37da470973ad12cb7593
Author: Tommaso Visconti <[email protected]>
Date: Mon Nov 25 16:39:53 2013 +0100

    message1

Detached HEAD - 4

$ git checkout master
Warning: you are leaving 1 commit behind, not connected to
any of your branches:

  b843940 det commit

If you want to keep them by creating a new branch, this may be a good time
to do so with:

 git branch new_branch_name b843940

Switched to branch 'master'

$ git log
commit 33676f027d9c36c66f2a2d5d74ee1cbf3e1ff56b
Author: Tommaso Visconti <[email protected]>
Date: Mon Nov 25 16:40:31 2013 +0100

    test2

commit 30657817081f6f0808bf37da470973ad12cb7593
Author: Tommaso Visconti <[email protected]>
Date: Mon Nov 25 16:39:53 2013 +0100

    message1

The b843940 commit is no longer reachable from any branch :-( It still exists in the object database until it is garbage-collected, so it can be recovered with the git branch command suggested in the warning
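
A commit made on a detached HEAD can be rescued as long as its hash is known (the checkout warning prints it, and git reflog lists it too). A minimal sketch of the whole scenario in a temporary repository, assuming a POSIX shell and git on the PATH; the branch name rescued and the demo identity are made up.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > test.txt;  git add .; git commit -q -m message1
first=$(git rev-parse HEAD)
echo two > test2.txt; git add .; git commit -q -m test2

# Detach HEAD on the first commit and commit there.
git checkout -q "$first"
echo detached > det.txt; git add .; git commit -q -m 'det commit'
lost=$(git rev-parse HEAD)

# Back on master the commit no longer shows up in the log...
git checkout -q master
git log --oneline

# ...but the object still exists, so a branch can be pointed at it.
git branch rescued "$lost"
git log --oneline rescued
```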

Basic commands

Porcelain

Background: refspec

A refspec is a mapping between remote and local references

An example is the fetch entry in a remote config:

[remote "origin"]
    url = [email protected]:schacon/simplegit-progit.git
    fetch = +refs/heads/*:refs/remotes/origin/*

Format: +<src>:<dst>

<src>: pattern for references on the remote side
<dst>: where those references will be written locally
+: (optional) update the reference even if it isn’t a fast-forward

(This is the fetch direction; when pushing, <src> is the local ref and <dst> the remote one)
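
A fetch refspec can also be passed explicitly on the command line; when fetching, <src> is the ref on the remote side and <dst> the local ref that receives it. A minimal sketch using two throwaway local repositories, assuming a POSIX shell and git on the PATH; the directory names origin and local are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# A "remote" repository with one commit on master.
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/master
echo hello > README; git add .; git commit -q -m 'initial'
remote_master=$(git rev-parse master)

# A second repository that tracks it.
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/origin"
git config --get remote.origin.fetch   # default refspec written by "remote add"

# Fetch with an explicit +<src>:<dst> refspec.
git fetch -q origin '+refs/heads/master:refs/remotes/origin/master'
tracked=$(git rev-parse refs/remotes/origin/master)
echo "remote master: $remote_master"
echo "local copy:    $tracked"
```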

Background: ancestry references

Given a ref (e.g. HEAD):

• HEAD^ is the first parent of HEAD
• HEAD^2 is the second parent (and so on..)
• HEAD~ is identical to HEAD^1
• HEAD~2 is the parent of the parent of HEAD

^ is useful for merges, where a commit has two or more parents
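
The difference between ^ and ~ only shows up on a merge commit. A minimal sketch that builds one in a temporary repository, assuming a POSIX shell and git on the PATH; the commit messages A, B, C, M and the branch name feature are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m A; a=$(git rev-parse HEAD)
git checkout -q -b feature
git commit -q --allow-empty -m B; b=$(git rev-parse HEAD)
git checkout -q master
git commit -q --allow-empty -m C; c=$(git rev-parse HEAD)
git merge -q --no-ff -m M feature   # M has two parents: C and B

git log -1 --format=%s 'HEAD^'      # C: first parent (the branch merged into)
git log -1 --format=%s 'HEAD^2'     # B: second parent (the merged branch)
git log -1 --format=%s 'HEAD~2'     # A: first parent of the first parent
```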

Background: ranges

git log <refA>..<refB>

All commits reachable from refB that aren’t reachable from refA

Commits in refB not merged in refA

Synonyms:

git log ^refA refB
git log refB --not refA

With more than two refs:

git log refA refB ^refC

This form is useful when using more than two refs

Background: ranges

git log <refA>...<refB>

All commits reachable from either refA or refB, but not from both

Commits to be merged between the two refs
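
Both range notations can be compared by counting commits on two diverging branches. A minimal sketch in a temporary repository, assuming a POSIX shell and git on the PATH; the branch name feature and the commit messages are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m base
git checkout -q -b feature
git commit -q --allow-empty -m F1
git commit -q --allow-empty -m F2
git checkout -q master
git commit -q --allow-empty -m M1

only_feature=$(git rev-list --count master..feature)   # F1, F2
only_master=$(git rev-list --count feature..master)    # M1
either=$(git rev-list --count master...feature)        # F1, F2, M1
echo "$only_feature $only_master $either"

git log --format=%s '^master' feature                  # same as master..feature
```

git rev-list accepts the same range syntax as git log, which makes it handy for counting.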

Workflow

Basic commands - 1

git add --interactive

$ git add -i
           staged     unstaged path
  1:    unchanged        +1/-1 lipsum

*** Commands ***
  1: status       2: update       3: revert       4: add untracked
  5: patch        6: diff         7: quit         8: help

Really powerful, not really user-friendly

Basic commands - 2

git commit and commit message

Commit description (visible with --oneline), max 50 chars

Commit full message. Wrap lines at the 72nd char
Buzzword buzzword buzzword buzzword buzzword buzzword buzzword buzzword
buzzword buzzword buzzword buzzword buzzword buzzword

Commit footer
Commands, e.g.: fixes #<BUG>
etc.

git commit --amend

Basic commands - 3

git log

# One line per commit
git log --oneline

# Line diff between files
git log -p

# Word diff
git log -p --word-diff

# Diff stats without lines
git log --stat

# Custom format with history graph
git log --pretty=format:'%h %s' --graph

git diff

# Diff with remote
git diff HEAD..origin/master

.gitignore

# Anchored to the directory containing the .gitignore file
# (a slash at the beginning or in the middle anchors the pattern)
/public/README
public/README

# Matches at any depth
READ*

Useful tools

git blame: show the author of each line

git config alias.<name>: create command aliases

git bisect: find the problematic commit

git filter-branch: rewrite history, e.g. to delete committed passwords

git format-patch: create a patch file ready to be sent to somebody

git request-pull: support a pull-request workflow (generates a summary of pending changes)

Git stash

Save uncommitted changes (staged and unstaged) for future reuse

git unstash doesn’t exist, but:

git stash show -p stash@{0} | git apply -R

Create an alias to do it:

git config --global alias.stash-unapply '!git stash show -p | git apply -R'

Create a branch from a stash:

git stash branch <BRANCH>
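
The "unapply" trick can be tried end to end. A minimal sketch in a temporary repository, assuming a POSIX shell and git on the PATH; file.txt and the demo identity are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > file.txt; git add .; git commit -q -m 'initial'

echo two > file.txt            # uncommitted change
git stash -q                   # working tree is clean again
after_stash=$(cat file.txt)

git stash apply -q             # change is back, the stash entry is kept
after_apply=$(cat file.txt)

# "Unapply": apply the patch recorded in the stash entry in reverse.
git stash show -p 'stash@{0}' | git apply -R
after_unapply=$(cat file.txt)

echo "$after_stash $after_apply $after_unapply"
```

git stash apply is used instead of git stash pop so that the entry is still in the stash list when the reverse patch is generated.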

Git reset

Move HEAD to a specific state (e.g. a commit)

--soft   Move only HEAD
--mixed  Move HEAD and reset the staging area to it (default)
--hard   Move HEAD and reset staging area and working tree

git reset [<commit>] <PATH>

Doesn’t move HEAD: it only resets the staging-area entry of PATH (--soft and --hard are not accepted together with a path)
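
The three modes differ only in how far the reset propagates: HEAD, then the staging area, then the working tree. A minimal sketch that walks through them in a temporary repository, assuming a POSIX shell and git on the PATH; the file name f is illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
echo 1 > f; git add f; git commit -q -m c1
echo 2 > f; git add f; git commit -q -m c2

# --soft: only HEAD moves back, the index still holds version 2.
git reset -q --soft 'HEAD~1'
soft_staged=$(git diff --cached --name-only)

# --mixed: the index is reset to HEAD, the working tree keeps version 2.
git reset -q --mixed
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)

# --hard: the working tree is reset as well.
git reset -q --hard
hard_content=$(cat f)

echo "soft: staged=[$soft_staged]"
echo "mixed: staged=[$mixed_staged] unstaged=[$mixed_unstaged]"
echo "hard: f contains $hard_content"
```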

Push

git push <remote> <refspec>

refspec format => <src>:<dst>

git push origin master               # push local master to remote master
git push origin development:master   # push local development to remote master
git push origin :development         # empty <src>: delete remote development

git help push
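
Pushing with an empty <src> deletes the remote branch. A minimal sketch against a throwaway bare repository, assuming a POSIX shell and git on the PATH; the paths remote.git and local are made up for the example.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m 'initial'
git branch development

git push -q origin master          # local master -> remote master
git push -q origin development     # local development -> remote development
before=$(git ls-remote --heads origin | grep -c refs/heads/)

git push -q origin :development    # nothing -> remote development (delete)
after=$(git ls-remote --heads origin | grep -c refs/heads/)

echo "remote branches before: $before, after: $after"
```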

Fetch

Sync remotes status: git fetch <remote>

Uses .git/config to know what to update

[remote "origin"]
    url = [email protected]:schacon/simplegit-progit.git
    fetch = +refs/heads/master:refs/remotes/origin/master
    fetch = +refs/heads/qa/*:refs/remotes/origin/qa/*

After checking a remote status we can:
• merge
• rebase
• cherry-pick

Pull

git pull origin master

=

git fetch origin
git merge origin/master

In both cases the master branch must be checked out

Fast-forward merge

When the branch we’re merging into is a direct ancestor of the branch being merged, Git only has to move the pointer forward: this is a so-called fast-forward merge

To merge iss53 into master, Git just advances the master pointer to C3 without creating a commit object

Use merge --no-ff to force the creation of a commit object
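
The effect of --no-ff is visible in the number of commits on master after the merge. A minimal sketch in a temporary repository, assuming a POSIX shell and git on the PATH; it reuses the iss53, C1 and C3 names only as labels.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m C1
git checkout -q -b iss53
git commit -q --allow-empty -m C3
git checkout -q master

# Fast-forward: master is an ancestor of iss53, so only the pointer moves.
git merge -q iss53
ff_tip=$(git rev-parse master)
ff_commits=$(git rev-list --count master)

# Rewind master and merge again, this time forcing a merge commit.
git reset -q --hard 'HEAD~1'
git merge -q --no-ff -m 'merge iss53' iss53
noff_commits=$(git rev-list --count master)
merges=$(git rev-list --count --merges master)

echo "ff: $ff_commits commits, no-ff: $noff_commits commits ($merges merge)"
```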

3-way merge

When the commit we’re merging into isn’t a direct ancestor of the commit to be merged

C5 is a new commit object, including merge information

Merge strategies

recursive: default when merging a single head (can have sub-options)

octopus: default when merging more than one head

resolve

ours

subtree

Rebase

The experiment’s commit (C3) is replayed as a patch on top of master, creating a new commit C3’

It’s a rebase on experiment, then a FF merge on master

git checkout experiment
git rebase master        # C3’ is created
git checkout master
git merge experiment     # FF merge
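
The rebase-then-fast-forward sequence can be reproduced in a temporary repository. A minimal sketch, assuming a POSIX shell and git on the PATH; the commit on master is called C4 here only as a label, and the file names are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt; git add .; git commit -q -m C1
git checkout -q -b experiment
echo x > x.txt; git add .; git commit -q -m C3
old_c3=$(git rev-parse HEAD)
git checkout -q master
echo m > m.txt; git add .; git commit -q -m C4

git checkout -q experiment
git rebase -q master                # replays C3 on top of master as C3'
new_c3=$(git rev-parse HEAD)
git checkout -q master
git merge -q experiment             # fast-forward: master now points to C3'

git log --format='%h %s' master     # linear history, no merge commit
```

The replayed commit keeps its message but gets a new hash, because its parent changed.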

Advanced use

Advanced rebase

We want C8 and C9 on master

Advanced rebase

git rebase --onto master server client

“Check out the client branch, figure out the patches from the common ancestor of the client and server branches, and then replay them onto master”
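
The same topology can be rebuilt in a temporary repository to watch --onto pick only the client commits. A minimal sketch, assuming a POSIX shell and git on the PATH; the helper function commit and the one-file-per-commit layout are made up to avoid conflicts.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt"; git add .; git commit -q -m "$1"; }

commit C1; commit C2
git checkout -q -b server; commit C3
git checkout -q -b client; commit C8; commit C9
git checkout -q server;    commit C4
git checkout -q master;    commit C5; commit C6

# Replay the commits that are on client but not on server (C8, C9) onto master.
git rebase -q --onto master server client

git log --format=%s master..client   # C9 and C8, now on top of C6
ls                                   # no C3.txt: the server commit was left out
```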

Advanced rebase

Now perform a simple fast-forward merge on master to advance it to C9’

C8’ and C9’ were originally written on top of C3: replayed on top of C6 they can break the code!

Advanced rebase

With a plain rebase every commit is replayed as a new commit with a new hash; the original commits are left behind

With interactive rebase it is possible to edit history: changing commit messages, squashing, splitting, deleting and reordering commits

Cherry-pick

$ git checkout master
$ git cherry-pick e43a6f
Finished one cherry-pick.
[master]: created a0a41a9: "Cherry-pick example"
 3 files changed, 17 insertions(+), 3 deletions(-)

Submodules

Use other repositories in your own, e.g. an external lib in ./lib/<libname>

git submodule add [-b <BRANCH>] <REPO> <PATH>

Submodules track a commit or a branch (from 1.8.2)

Submodule information is stored in .gitmodules and .git/modules

.gitmodules must be committed at the beginning and after a submodule update

Subtree merging

Can be used in place of submodules

It is a way to track a branch inside a folder of another branch

RepositoryFolder/
 |_lib/
    |_libA/

RepositoryFolder is on the master branch and libA/ is the lib_a branch

Workflow examples

http://git-scm.com/book/en/Distributed-Git-Contributing-to-a-Project

Git flow

• master contains production code and is tagged at each release
• development contains dev code
• a new feature is created from development
• the feature merges into development
• development becomes a release
• a release merges into master
• a hotfix is a branch from master and merges into master and development

If you write code on master or development, you’re wrong!

Problem solving time