GIT: Content-addressable filesystem and Version Control System

Post on 06-May-2015


Git presentation, internals, advanced use and workflow examples. Presentation by Tommaso Visconti http://www.tommyblue.it for DrWolf srl http://www.drwolf.it


GIT: Content-addressable filesystem and Version Control System

@tommyblue - www.tommyblue.it

From Nofu to Shogun

Standard presentation

• Introduction

• Basic commands

• Advanced commands

• Git internals (optional)

This presentation

• Introduction

• Git internals

• Basic commands

• Advanced commands

• Fabbri’s problem-solving

Pizzu’s

GIT

Content-addressable filesystem + tools to manage it (porcelains) = Version Control System

CVS/SVN use deltas

With long histories, computing the current state of the files becomes expensive

Git uses snapshots

• A version is a tree (like a FS) using hashes as nodes

• Every version is a snapshot of the full repository, with all files

• A file is identified by a SHA-1 hash, depending on the file content

• If the content doesn’t change, the hash is the same
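The last two bullets can be checked from the command line. This is a minimal sketch, assuming a scratch directory and three hypothetical files (a.txt, b.txt, c.txt): two files with the same content get the same hash, whatever their name.

```shell
set -e
dir=$(mktemp -d)                  # scratch directory (hypothetical location)
cd "$dir"
printf 'same content\n'  > a.txt
printf 'same content\n'  > b.txt  # different name, identical content
printf 'other content\n' > c.txt

# Without -w, hash-object only computes the hash and stores nothing
hash_a=$(git hash-object a.txt)
hash_b=$(git hash-object b.txt)
hash_c=$(git hash-object c.txt)

echo "a.txt $hash_a"
echo "b.txt $hash_b"
echo "c.txt $hash_c"
```

The file name plays no role here: names are recorded in tree objects, not in the blob itself.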

Git internals

Plumbing

The .git folder

Everything is in the .git folder (delete it to delete the repo)

.git/
  hooks/
  info/
  objects/ => repo content
  refs/ => pointers to commit objects
  config
  description
  index => staging area info
  HEAD => checked-out branch

Git objects

All objects are identified by a hash

Git works like a key-value datastore

When you save an object in Git, it returns its hash

Objects

Blob, Tree, Commit, Tag

Blob objects - 1

Essentially the committed file with its content

# Save a simple text file (without -w it only computes the hash)
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4

# The created object (notice the folder structure)
$ find .git/objects -type f
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

# Extract the content using the hash as reference
$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4
test content

Blob objects - 2

# New version of the file
$ echo 'version 1' > test.txt
$ git hash-object -w test.txt
83baae61804e65cc73a7201a7252750c76066a30

# Now there are two objects
$ find .git/objects -type f
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4

# Restore the old version
$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4 > test.txt
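Where the blob hash comes from can be sketched by hand: Git hashes a small header ("blob", the size in bytes, a NUL byte) followed by the content. The example assumes the sha1sum tool from GNU coreutils is available (on other systems a different SHA-1 tool may be needed).

```shell
set -e
content='test content'

# Hash computed by Git (echo appends the trailing newline)
git_hash=$(echo "$content" | git hash-object --stdin)

# Same hash by hand: SHA-1 of "blob <size>\0" followed by the content
size=$(echo "$content" | wc -c | tr -d ' ')
manual_hash=$( { printf 'blob %s\0' "$size"; echo "$content"; } | sha1sum | cut -d' ' -f1 )

echo "git:    $git_hash"
echo "manual: $manual_hash"
```

Both values should match the hash shown for 'test content' in the slides above.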

Tree objects

Contains references to its children (trees or blobs), like a UNIX folder

Every commit (snapshot) points to a root tree; when the content changes, the tree hash changes too

# Adds a file to the index (staging area)
$ git update-index --add --cacheinfo 100644 \
  a906cb2a4a904a152e80877d4088654daad0c859 README

# Show the tree object content
$ git cat-file -p 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
100644 blob a906cb2a4a904a152e80877d4088654daad0c859 README
100644 blob 8f94139338f9404f26296befa88755fc2598c289 Rakefile
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0 lib

# Write the tree object (containing indexed/staged files)
$ git write-tree
1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
$ git cat-file -t 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
tree
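The same plumbing steps can be replayed end to end in a throwaway repository. This is only a sketch: the repository location and the README content are made up for the example, so the hashes will differ from the ones in the slides.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q

# Store a blob, stage it under the name README, then write the tree
blob=$(echo 'hello' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644 "$blob" README
tree=$(git write-tree)

tree_type=$(git cat-file -t "$tree")      # object type
tree_listing=$(git cat-file -p "$tree")   # one line per entry: mode, type, hash, name
echo "$tree_type"
echo "$tree_listing"
```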

Commit objects - 1

Commit message, other information (e.g. GPG signature) and a reference to a tree object (through its hash)

# First commit object given a tree object (1f7a7a)
$ echo 'first commit' | git commit-tree 1f7a7a
fdf4fc3344e67ab068f836878b6c4951e3b15f3d

# Let’s check the commit content
$ git cat-file -p fdf4fc3
tree 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
author Tommaso Visconti <tommaso.visconti@drwolf.it> 1243040974 -0700
committer Tommaso Visconti <tommaso.visconti@drwolf.it> 1243040974 -0700

first commit
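A short history can be built with plumbing commands only. The sketch below assumes a throwaway repository and a demo identity (both hypothetical); the second commit is linked to the first with -p, and a branch ref is then pointed at it so that git log can walk the chain.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# First snapshot: blob -> index -> tree -> commit
blob1=$(echo 'version 1' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644 "$blob1" test.txt
first=$(echo 'first commit' | git commit-tree "$(git write-tree)")

# Second snapshot, linked to the first one with -p (parent)
blob2=$(echo 'version 2' | git hash-object -w --stdin)
git update-index --cacheinfo 100644 "$blob2" test.txt
second=$(echo 'second commit' | git commit-tree "$(git write-tree)" -p "$first")

# Point a branch at the result so porcelain commands can see it
git update-ref refs/heads/master "$second"
history=$(git log --format=%s master)
echo "$history"
```

Commit hashes depend on author, committer and timestamps, so they change at every run; blob hashes depend only on the content.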

TAG objects

A tag object is like a commit object, but it references a commit object (not a tree)

Lightweight tag

Just a ref pointing to a commit: no tag object is created

Annotated tag

Git creates a tag object and a ref under refs/tags pointing to its hash

$ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'test tag'

$ cat .git/refs/tags/v1.1
9585191f37f7b0fb9444f35a9bf50de191beadc2

$ git cat-file -p 9585191f37f7b0fb9444f35a9bf50de191beadc2
object 1a410efbd13591db07496601ebc7a059dd55cfe9
type commit
tag v1.1
tagger Tommaso Visconti <tommaso.visconti@drwolf.it> Mon Nov 25 15:56:23 2013

test tag
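The difference between the two kinds of tag shows up in the type of the object each ref resolves to. A minimal sketch, assuming a throwaway repository, a demo identity and two hypothetical tag names:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m 'first commit'

git tag v1.0-light                # lightweight: only a ref
git tag -a v1.0 -m 'test tag'     # annotated: tag object + ref

light_type=$(git cat-file -t "$(git rev-parse v1.0-light)")
annotated_type=$(git cat-file -t "$(git rev-parse v1.0)")
echo "lightweight -> $light_type"
echo "annotated   -> $annotated_type"

# Peeling the annotated tag leads back to the tagged commit
peeled=$(git rev-parse 'v1.0^{commit}')
```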

References - 1

Useful to avoid using and remembering hashes

Stored in .git/refs/

Text files just containing the commit hash

References - 2

Branches are references stored in .git/refs/heads

# Update master ref to the last commit
$ git update-ref refs/heads/master 1a410e

Lightweight tags are refs placed in .git/refs/tags

Remote-tracking branches are refs stored in .git/refs/remotes/<REMOTE>/<BRANCH>

The commit hash stored in a remote ref is the commit that branch pointed to the last time the remote was synchronized (fetch/push)

Repository overview

How does Git know which branch or commit is checked out?

# HEAD is a symbolic reference
$ cat .git/HEAD
ref: refs/heads/master

# Get HEAD value using the proper tool
$ git symbolic-ref HEAD
refs/heads/master

# Set HEAD
$ git symbolic-ref HEAD refs/heads/test
$ cat .git/HEAD
ref: refs/heads/test

HEAD - 1

HEAD generally points to the checked out branch

Committing in this situation moves the current branch to the new commit hash and, as a consequence, HEAD points to it

If HEAD points to a commit and not a branch, it’s the so-called Detached HEAD

Committing in this situation advances HEAD to the new commit hash, without modifying any branch

HEAD - 2

HEAD can be moved with git reset

Detached HEAD - 1

$ git add .
$ git commit -m message1
[master (root-commit) 3065781] message1
 1 file changed, 1 insertion(+)
 create mode 100644 test.txt

$ cat .git/HEAD
ref: refs/heads/master

$ cat .git/refs/heads/master
30657817081f6f0808bf37da470973ad12cb7593

$ echo prova > test2.txt
$ git add .
$ git commit -m test2
[master 33676f0] test2
 1 file changed, 1 insertion(+)
 create mode 100644 test2.txt

$ cat .git/refs/heads/master
33676f027d9c36c66f2a2d5d74ee1cbf3e1ff56b

$ cat .git/HEAD
ref: refs/heads/master

Detached HEAD - 2

$ git checkout 30657817081f6f0808bf37da470973ad12cb7593

Note: checking out '30657817081f6f0808bf37da470973ad12cb7593'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 3065781... message1

$ cat .git/HEAD
30657817081f6f0808bf37da470973ad12cb7593

$ echo 'detached' > det.txt
$ git add .
$ git commit -m 'det commit'

Detached HEAD - 3

[detached HEAD b843940] det commit
 1 file changed, 1 insertion(+)
 create mode 100644 det.txt

$ git log
commit b8439407b2524b55ceba814a9eef2ee92655a0bb
Author: Tommaso Visconti <tommaso.visconti@gmail.com>
Date:   Mon Nov 25 16:49:05 2013 +0100

    det commit

commit 30657817081f6f0808bf37da470973ad12cb7593
Author: Tommaso Visconti <tommaso.visconti@gmail.com>
Date:   Mon Nov 25 16:39:53 2013 +0100

    message1

Detached HEAD - 4

$ git checkout master
Warning: you are leaving 1 commit behind, not connected to
any of your branches:

  b843940 det commit

If you want to keep them by creating a new branch, this may be a good time
to do so with:

 git branch new_branch_name b843940

Switched to branch 'master'

$ git log
commit 33676f027d9c36c66f2a2d5d74ee1cbf3e1ff56b
Author: Tommaso Visconti <tommaso.visconti@gmail.com>
Date:   Mon Nov 25 16:40:31 2013 +0100

    test2

commit 30657817081f6f0808bf37da470973ad12cb7593
Author: Tommaso Visconti <tommaso.visconti@gmail.com>
Date:   Mon Nov 25 16:39:53 2013 +0100

    message1

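The b843940 commit is gone from every branch.. :-( (the object is still stored and can be found by hash or with git reflog until it’s garbage-collected)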
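A commit made on a detached HEAD can be kept by pointing a branch at it, as the Git warning itself suggests. The sketch below replays the scenario in a throwaway repository; the repository location, the demo identity and the branch name rescued are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m message1
first=$(git rev-parse HEAD)
git commit -q --allow-empty -m test2

git checkout -q "$first"          # detached HEAD
git commit -q --allow-empty -m 'det commit'
lost=$(git rev-parse HEAD)
git checkout -q master            # 'det commit' is no longer on any branch

# The object still exists, so a branch can simply point at it
git branch rescued "$lost"
rescued_subject=$(git log -1 --format=%s rescued)
echo "$rescued_subject"
```

If the hash was not saved, git reflog lists the positions HEAD went through, including the detached commit.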

Basic commands

Porcelain

Background: refspec

A refspec is a mapping between remote and local branches

An example is the fetch entry in a remote config:

[remote "origin"]
    url = git@github.com:schacon/simplegit-progit.git
    fetch = +refs/heads/*:refs/remotes/origin/*

Format: +<src>:<dst>

In a fetch refspec:

<src>: pattern for references on the remote side
<dst>: where those references will be written locally
+: (optional) update the reference even if it isn’t a fast-forward

Background: ancestry references

Given a ref (eg. HEAD):

• HEAD^ is the first parent of HEAD

• HEAD^2 is the second parent (and so on..)

• HEAD~ is identical to HEAD^1

• HEAD~2 is the parent of the parent of HEAD (following first parents)

^ is useful for merge commits, which have two or more parents
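These notations can be tried on a small merge. A minimal sketch, assuming a throwaway repository, a demo identity and a hypothetical branch named topic:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m C1
git checkout -q -b topic
git commit -q --allow-empty -m C2-topic
git checkout -q master
git commit -q --allow-empty -m C3-master
git merge -q --no-edit topic      # merge commit with two parents

first_parent=$(git rev-parse HEAD^)     # same as HEAD^1 and HEAD~
second_parent=$(git rev-parse HEAD^2)   # tip of the merged branch
grandparent=$(git rev-parse HEAD~2)     # first parent of the first parent
```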

Background: ranges

git log <refA>..<refB>

All commits reachable from refB that aren’t reachable from refA

Commits in refB not merged in refA

Synonyms:

git log ^refA refB
git log refB --not refA
git log refA refB ^refC

The last form is useful when more than two refs are involved

Background: ranges

git log <refA>...<refB>

All commits reachable from either refA or refB, but not from both

Commits to be merged between the two refs
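The two range notations can be compared by counting commits with git rev-list, which accepts the same range syntax as git log. A sketch on a throwaway repository (location, identity and branch names are assumptions): topic has two commits of its own, master has one.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m C1
git checkout -q -b topic
git commit -q --allow-empty -m T1
git commit -q --allow-empty -m T2
git checkout -q master
git commit -q --allow-empty -m M1

only_topic=$(git rev-list --count master..topic)   # T1 and T2
synonym=$(git rev-list --count '^master' topic)     # same set, other notation
only_master=$(git rev-list --count topic..master)   # M1
either=$(git rev-list --count master...topic)       # T1, T2 and M1, but not C1
echo "$only_topic $synonym $only_master $either"
```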

Workflow

Basic commands - 1

git add --interactive

$ git add -i
           staged     unstaged path
  1:    unchanged        +1/-1 lipsum

*** Commands ***
  1: status       2: update       3: revert       4: add untracked
  5: patch        6: diff         7: quit         8: help

Really powerful, not really user-friendly

Basic commands - 2

git commit and commit message

Commit description (visible with --oneline), max 50 chars

Commit full message. Wrap lines at the 72nd char
Buzzword buzzword buzzword buzzword buzzword buzzword buzzword buzzword
buzzword buzzword buzzword buzzword buzzword buzzword

Commit footer
Commands, e.g.: fixes #<BUG>
etc.

git commit --amend

Basic commands - 3

git log

# One line per commit
git log --oneline

# Line diff between files
git log -p

# Word diff
git log -p --word-diff

# Diff stats without lines
git log --stat

# Pretty log with graph
git log --pretty=format:'%h %s' --graph

git diff

# Diff with remote
git diff HEAD..origin/master

.gitignore

# Anchored to the folder containing the .gitignore (the pattern contains a slash)
/public/README
public/README

# Matches at any level (no slash in the pattern)
READ*

Useful tools

git blame: Show the author of each line

git config alias.<name> <command>: Create command aliases

git bisect: Find the problematic commit

git filter-branch: Hard change of history, e.g. to delete committed passwords

git format-patch: Create a patch file ready to be sent to somebody

git request-pull: Implement pull-request workflow

Git stash

Save uncommitted changes (staged and unstaged) for future reuse

git unstash doesn’t exist, but an applied stash can be un-applied with:

git stash show -p stash@{0} | git apply -R

Create an alias to do it:

git config --global alias.stash-unapply '!git stash show -p | git apply -R'

Create a branch from stash:

git stash branch <BRANCH>
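The stash, apply and un-apply cycle can be followed on a single file. This is a sketch under stated assumptions: a throwaway repository, a demo identity and a hypothetical file notes.txt.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo v1 > notes.txt
git add notes.txt
git commit -q -m v1

echo v2 > notes.txt               # uncommitted change
git stash -q                      # change saved, working tree back to v1
after_stash=$(cat notes.txt)

git stash apply -q                # change restored, stash entry kept
after_apply=$(cat notes.txt)

# Un-apply: reverse the patch recorded in the stash
git stash show -p 'stash@{0}' | git apply -R
after_unapply=$(cat notes.txt)
echo "$after_stash $after_apply $after_unapply"
```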

Git reset

Move HEAD (and the current branch) to a specific state (e.g. a commit)

--soft   Move only HEAD
--mixed  Move HEAD and reset the staging area to it (default)
--hard   Move HEAD and reset staging area and working tree

git reset [<commit>] <PATH>

Doesn’t move HEAD, but resets the staging-area entry of PATH; the working tree is left untouched and --soft/--hard are not accepted together with a path
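The three modes can be observed one after the other on a file with two committed versions. A minimal sketch, assuming a throwaway repository, a demo identity and a hypothetical file file.txt:

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo v1 > file.txt
git add file.txt
git commit -q -m v1
echo v2 > file.txt
git commit -q -am v2

git reset -q --soft HEAD~1        # HEAD moves back; index and working tree keep v2
soft_staged=$(git diff --cached --name-only)

git reset -q --mixed              # index reset to HEAD (v1); working tree keeps v2
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)

git reset -q --hard               # index and working tree reset to HEAD (v1)
hard_content=$(cat file.txt)
```

After --soft the change is still staged, after --mixed it is only in the working tree, and after --hard it is discarded.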

Push

git push <remote> <refspec>

git push origin master

git push origin development:master

git push origin :development

refspec format => <src>:<dst>

git help push
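The three push forms can be tried against a local bare repository standing in for the server. This is a sketch: the two temporary directories and the demo identity are assumptions, while branch names follow the slide.

```shell
set -e
work=$(mktemp -d)                 # working repository (hypothetical location)
remote=$(mktemp -d)               # bare repository playing the role of origin
git init -q --bare "$remote"
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git remote add origin "$remote"
git commit -q --allow-empty -m C1
git checkout -q -b development
git commit -q --allow-empty -m C2

git push -q origin master               # local master -> remote master
git push -q origin development          # local development -> remote development
git push -q origin development:master   # local development -> remote master
remote_master=$(git ls-remote origin refs/heads/master | cut -f1)

git push -q origin :development         # empty <src>: delete remote development
remote_branches=$(git ls-remote --heads origin | cut -f2)
echo "$remote_branches"
```

Note that for push the refspec reads local:remote, the opposite direction of a fetch refspec.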

Fetch

[remote "origin"]
    url = git@github.com:schacon/simplegit-progit.git
    fetch = +refs/heads/master:refs/remotes/origin/master
    fetch = +refs/heads/qa/*:refs/remotes/origin/qa/*

Sync remotes status: git fetch <remote>

Uses .git/config to know what to update

After checking a remote status we can:

• merge

• rebase

• cherry-pick

Pull

git pull origin master

=

git fetch origin
git merge origin/master

In both cases the master branch must be checked out

Fast-forward merge

When the branch we’re merging into is a direct ancestor of the branch being merged, Git only has to move the pointer forward: this is a so-called fast-forward merge

To merge iss53 into master, Git just advances the master pointer to C3 without creating a commit object. This is a fast-forward merge

Use merge --no-ff to force creation of a commit object
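The difference between the two behaviours can be seen by merging the same branch twice. A sketch on a throwaway repository (location and identity are assumptions; branch and commit names follow the slide):

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m C1
git checkout -q -b iss53
git commit -q --allow-empty -m C3
git checkout -q master

git merge -q iss53                # fast-forward: master just moves to C3
ff_tip=$(git rev-parse master)

git reset -q --hard HEAD~1        # undo, then merge again forcing a commit object
git merge -q --no-ff --no-edit iss53
# rev-list --parents prints the commit followed by its parents
noff_words=$(git rev-list --parents -n 1 HEAD | wc -w)
```

With --no-ff the merge commit has two parents, so the branch remains visible in the history graph.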

3-way merge

When the commit we’re merging into isn’t a direct ancestor of the commit to be merged

C5 is a new commit object, including the merge information

Merge strategies

recursive: default when merging a single head (can have sub-options)

octopus: default otherwise

resolve

ours

subtree

Rebase

The experiment’s commit (C3) is turned into a patch and re-applied on top of master as a new commit C3’

It’s a rebase on experiment, then a FF merge on master

git checkout experiment
git rebase master # C3’ is created
git checkout master
git merge experiment # FF merge
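The same four commands can be replayed on a throwaway repository to see that C3 gets a new hash and that master ends up with a linear history. The repository location, the demo identity, the file names and the commit label C4 for the extra commit on master are assumptions made for this sketch.

```shell
set -e
repo=$(mktemp -d)                 # throwaway repository (hypothetical location)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo base > base.txt;  git add base.txt;  git commit -q -m C1
git checkout -q -b experiment
echo exp > exp.txt;    git add exp.txt;   git commit -q -m C3
old_c3=$(git rev-parse HEAD)
git checkout -q master
echo main > main.txt;  git add main.txt;  git commit -q -m C4

git checkout -q experiment
git rebase -q master              # C3 is replayed on top of master as a new commit
new_c3=$(git rev-parse HEAD)
git checkout -q master
git merge -q experiment           # FF merge
subjects=$(git log --format=%s master | tr '\n' ' ')
echo "$subjects"
```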

Advanced use

Advanced rebase

We want C8 and C9 on master

Advanced rebase

git rebase --onto master server client

“Check out the client branch, figure out the patches from the common ancestor of the client and server branches, and then replay them onto master”

Advanced rebase

Now perform a simple fast-forward merge on master to advance it to C9’

C8’ and C9’ were originally written on top of C3: replayed on C6 they can break the code!

Advanced rebase

With rebase the original commits are replaced by new ones with different hashes, so the original commit history is lost

With interactive rebase it’s possible to edit history: changing commit messages, squashing, splitting, deleting and reordering commits

Cherry-pick

$ git checkout master
$ git cherry-pick e43a6f
Finished one cherry-pick.
[master]: created a0a41a9: "Cherry-pick example"
 3 files changed, 17 insertions(+), 3 deletions(-)

Submodules

Use other repositories in your own

e.g. an external lib in ./lib/<libname>

git submodule add [-b <BRANCH>] <REPO> <PATH>

Submodules track a commit or a branch (from 1.8.2)

Submodule information is stored in .gitmodules and .git/modules.

.gitmodules must be committed at the beginning and after a submodule update

Subtree merging

Can be used in place of submodules

It’s a way to track a branch inside a folder of another branch

RepositoryFolder/
 |_lib/
    |_libA/

RepositoryFolder is on the master branch and libA/ is the lib_a branch

Workflow examples

http://git-scm.com/book/en/Distributed-Git-Contributing-to-a-Project

Git flow

master contains production code and is tagged with releases

development contains dev code

a new feature is created from development

the feature merges into development

development becomes a release

a release merges into master

a hotfix is a branch from master and merges into master and development

If you write code on master or development, you’re wrong!

Problem solving time