Version control with GitHub, August 26th, 2014. Daniel Schreij, VU Cognitive Psychology department.

Post on 27-Dec-2015


Version control with GitHub
August 26th, 2014

Daniel Schreij

VU Cognitive Psychology department

http://ems.psy.vu.nl/userpages/data-analysis-course

What is version control?

“Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. Even though version control is mainly used for software source code, in reality any type of file on a computer can be placed under version control.”

Git

• Designed by Linus Torvalds, the creator of Linux, to keep track of the many changes and additions that were made to the Linux kernel
• One of the most accessible and easy-to-adopt VCSs of the moment
– Others are Mercurial (hg) and Subversion (svn)
• Originally a command-line system, but there are now a lot of good GUI tools available too
• Like “track changes” in Word, but for software code (and with full history)

GitHub

• A website on which Git repositories (repos) can be stored
• The website offers pretty graphical overviews of everything concerning your repositories
• Public repositories are free; private repositories need to be paid for (as of 2014)
• … but not if you register as an academic
• A good starting point for projects that will be a collaborative effort
• Like Dropbox (but a bit more labor-intensive)

GitHub | GUI

Get the client from http://github.com and, in it, create a GitHub account (optional)

Git basics | .git folder

• Git(hub) puts a hidden folder (.git) in the folder you want to be tracked (i.e. the folder that is to become a Git repository)
• In this .git folder, a history of all tracked changes is kept. If you move your folder and this .git folder moves with it: no harm done
• (Accidentally) erase this .git folder, however, and the log of your changes is lost (but luckily you can often just download it again from github.com, provided the repository was published there)

Git basics | Creating a repository

• Use the GUI!
• Or: from the command line, go to the folder whose files you want Git to track and initialize it as a Git repository

> cd </somewhere/somefolder>
> git init
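As a rough sketch of what the two commands above do, the script below initializes a repository in a scratch folder and confirms that the hidden .git folder appears. The folder name somefolder and the use of mktemp are assumptions made for illustration; any folder you own would behave the same way.

```shell
set -e   # stop at the first failing command

# Hypothetical scratch location (an assumption for this demo).
workdir=$(mktemp -d)
mkdir "$workdir/somefolder"
cd "$workdir/somefolder"

git init -q     # turn the current folder into a Git repository

ls -a                                  # the hidden .git folder is now listed
git rev-parse --is-inside-work-tree    # prints "true" inside a repository
```

Deleting the .git folder again would turn this back into an ordinary folder without any history.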

Git basics | Add files to repo

• Let’s place a file practice.py containing the following code in the folder:

words = ["This","is","a","test"]
sentence = ",".join(words)

• If we now check our repo in the GitHub GUI, it will have detected this file!

[Screenshot callouts: “Indicates file is just added”; “Repo is not on GitHub yet (needs to be published first)”]

Git basics | Commit from GUI

• The file has been detected, but its state has not been saved (you cannot revert to its current state if you make changes now)
• You need to commit your changes
• Enter a short (!) description of the changes in the top field and (optionally) a longer one in the bottom box

Git basics | Commit from CLI

• If you work from the terminal/console, you can commit files with git commit, but first you need to tell Git to start tracking the files

> git add practice.py [<file 2> … <file n>]
or
> git add -A (to stage all new, changed and deleted files)

• Now you can commit your changes

> git commit -a

This will open a text editor in which you can enter your commit message (just as in the GUI)
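The add-then-commit sequence can be tried end to end in a throwaway repository. The sketch below is one possible way to do it; the scratch folder, the placeholder identity ("Demo User") and the commit message are assumptions, and -m is used to pass the short description inline so that no editor opens.

```shell
set -e
repo=$(mktemp -d)       # scratch folder, an assumption for this demo
cd "$repo"
git init -q

# Identity for this throwaway repo only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# The file from the slides.
cat > practice.py <<'EOF'
words = ["This","is","a","test"]
sentence = ",".join(words)
EOF

git status --short                   # untracked files are listed with "??"
git add practice.py                  # start tracking (stage) the file
git commit -q -m "Add practice.py"   # -m supplies the short description inline
git log --oneline                    # one line per commit
```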

Git basics | Commit from GUI

The GUI will now list the commit as an entry in the history

[Screenshot callouts: “Short commit description”; “Long description”]

Git basics | Moving on…

• Let’s make some (drastic?) changes to practice.py

# Here I say something about this file
# @author Daniel Schreij (d.schreij@vu.nl)
words = ["This","is","a","test"]
sentence = " ".join(words)
print(sentence)

• After saving the file, these changes will be registered by Git and shown in the GUI

Git basics | Moving on…

• And after committing these changes again…

Git basics | Committed again…

From the CLI, just enter git commit -a again (no need for git add, since the file is already tracked)

[Screenshot callouts: “Revert to this commit”; “No. of changes and ratio: additions vs. deletions vs. unchanged”]
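For the command line, a minimal sketch of the same cycle might look as follows: change a tracked file, inspect the change, commit it with -a, and then bring back the earlier version of the file. The scratch folder, identity and commit messages are placeholders; restoring a file with git checkout <commit> -- <file> is one of several ways to go back and is not necessarily what the GUI's revert button does.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'words = ["This","is","a","test"]\nsentence = ",".join(words)\n' > practice.py
git add practice.py
git commit -q -m "Add practice.py"

# Change the tracked file: join with spaces and print the result.
printf 'words = ["This","is","a","test"]\nsentence = " ".join(words)\nprint(sentence)\n' > practice.py

git diff --stat      # summary of additions and deletions
git commit -q -a -m "Join words with spaces and print the sentence"

# Bring back the file as it was one commit earlier (HEAD~1).
git checkout HEAD~1 -- practice.py
cat practice.py
```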

Git basics | 1000 commits later

Git basics | Publishing

• We all know we like publishing, so it is time to publish our repository on GitHub.com
– Done with the button in the top-right, and that’s it!
– From then on, this button will have the label Sync

• From the CLI it’s some more work:

link your local repository to a remote repository (which must already exist, e.g. created on github.com)
> git remote add <name> <url> (e.g. git remote add origin https://github.com/dschreij/DAT.git)

copy (push) your commits to the remote repository
> git push <name> <branch> (e.g. git push origin master)
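The CLI route can be rehearsed without a GitHub account. In the sketch below a local bare repository (remote.git, a made-up name) stands in for the empty repository you would create on github.com; with a real remote only the URL changes. The remote name origin is the usual convention rather than a requirement.

```shell
set -e
workdir=$(mktemp -d); cd "$workdir"

# Stand-in for the empty repository you would create on github.com.
git init -q --bare remote.git

git init -q local; cd local
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("hello")' > practice.py
git add practice.py
git commit -q -m "Add practice.py"

branch=$(git symbolic-ref --short HEAD)       # "master" or "main", depending on your Git setup
git remote add origin "$workdir/remote.git"   # register the remote under the name "origin"
git push -q -u origin "$branch"               # copy the commits; -u remembers the link for later pushes
git remote -v                                 # list the configured remotes
```

After the first push with -u, a plain git push from this branch is enough.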

GitHub | Online

GitHub | README

• A README(.md) file in a repository is automatically displayed on the website and in the GitHub GUI
• Let’s create one for our repository

GitHub | README

• The format can be plain text or Markdown
• Markdown is a plain-text formatting syntax. For an overview of the possibilities go to: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
• Let’s create a file called README.md in our repo with the contents:

# Practice program
**copyright**: *VU DAT group*
## About
This program has been made to practice Git

GitHub | README

After committing and syncing/pushing, our page will look like this:

[Screenshot callouts: “Other users can download or clone our repository from here”; “Our README(.md) appears here”; “README(.md) added to repo”]

Git | .gitignore

• Some OSs and programs generate a lot of garbage
– .DS_Store, .pyc, ~ files, .tmp, etc.
• They might not always be visible to the user, but Git sees them and will include them in your commits
• You can list the files, file extensions and folders to omit in the .gitignore file

Git | .gitignore

# Windows image file caches
Thumbs.db
ehthumbs.db

# Folder config file
Desktop.ini

# Recycle Bin used on file shares
$RECYCLE.BIN/

# Windows Installer files
*.cab
*.msi
*.msm
*.msp
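To see such rules in action, the sketch below creates two junk files and one real file in a scratch repository and adds a two-line .gitignore taken from the listing above. The file name setup.msi is made up for the demo.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q

# Two junk files and one real file (empty placeholders).
touch Thumbs.db setup.msi practice.py

# Ignore one exact file name and one extension.
printf 'Thumbs.db\n*.msi\n' > .gitignore

git status --short                        # lists only .gitignore and practice.py
git check-ignore -v Thumbs.db setup.msi   # shows which rule matched each file
```

Note that .gitignore only affects untracked files; a file that was committed earlier stays tracked until it is removed from the repository.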

Git | Branches

• Sometimes you want to make some radical changes to your project, but also want to keep a version as it is.
• You can create a new branch and play around with your changes there
• In your repository folder, you can create a new branch in the CLI with:
> git branch <branch_name> (create the branch)
> git checkout <branch_name> (switch to it)
or shorthand:
> git checkout -b <branch_name> (create & switch)

• The new branch starts out as a copy of your current branch

Git | Branches

• In the GUI, new branches are created by the button at the top-left labelled master (which is the current and main branch)

• Let’s create a new branch called experimental

Git | Branches

• Any commit you make will only apply to the current branch
• In the GUI, you can determine which branch you’re in by checking the label at the top-left
• In the CLI you can check by entering
> git status
or
> git branch
* experimental
  master

• You can switch back to the main branch with
> git checkout master
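The branch commands can be tried in isolation with a sketch like the one below. It assumes a scratch repository with a placeholder identity, and it reads the name of the main branch from Git instead of hard-coding master, because newer Git installations may call it main.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'print("hello")' > practice.py
git add practice.py
git commit -q -m "Add practice.py"

main_branch=$(git symbolic-ref --short HEAD)   # "master" in the slides

git checkout -q -b experimental     # create the branch and switch to it
echo '# risky idea' >> practice.py
git commit -q -a -m "Try a risky idea"

git branch                          # the current branch is marked with *
git checkout -q "$main_branch"      # back to the main branch
cat practice.py                     # the risky idea is not in this version
```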

Git | Merging branches

• If all works out in the experimental branch, we might want to apply all its changes to our master branch too
• You can do this by merging branches
• Make sure you are in the branch you want to merge into
> git checkout master
and then merge the changes from experimental with
> git merge experimental
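A minimal sketch of such a merge, again in a scratch repository with placeholder names: the main branch receives no commits of its own here, so Git can simply move it forward to the tip of experimental (a so-called fast-forward merge).

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'print("hello")' > practice.py
git add practice.py
git commit -q -m "Add practice.py"
main_branch=$(git symbolic-ref --short HEAD)   # "master" in the slides

# Do some work on a separate branch.
git checkout -q -b experimental
echo 'print("new feature")' >> practice.py
git commit -q -a -m "Add new feature"

# Switch to the branch you want to merge into, then merge.
git checkout -q "$main_branch"
git merge -q experimental
git log --oneline                   # both commits are now on the main branch
```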

Git | Merging branches

• It is also possible to merge branches in the GUI by clicking in the branches menu

• There you can drag the individual branches you want to merge to fields at the bottom

GitHub | Merging branches

On GitHub.com you can view network graphs of the various branches, showing how they originate from, or are merged into, other branches
– Also includes branches of collaborators

Git | Merge conflicts

• If the same part of a file has been changed in both of the branches being merged, these changes might conflict
• You need to resolve these conflicts manually and then commit the resulting file(s)
• Conflicts can be very nasty, but luckily there are so-called merge tools that make resolving them easier
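A conflict is easy to provoke on purpose, which is a safe way to see what one looks like. The sketch below changes the same line differently on two branches of a scratch repository and then resolves the conflict by hand; the file contents, identity and commit messages are made up for the demo, and overwriting the file with echo stands in for editing it in a text editor or merge tool.

```shell
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'greeting = "hello"' > practice.py
git add practice.py
git commit -q -m "Add greeting"
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line differently on two branches.
git checkout -q -b experimental
echo 'greeting = "hi"' > practice.py
git commit -q -a -m "Say hi"

git checkout -q "$main_branch"
echo 'greeting = "hey"' > practice.py
git commit -q -a -m "Say hey"

# The merge stops and reports a conflict (non-zero exit status, hence "|| true").
git merge experimental || true
git status --short      # "UU practice.py" marks the file as conflicted
cat practice.py         # both versions, separated by <<<<<<<, ======= and >>>>>>> markers

# Resolve by hand: write the version you want, then stage and commit.
echo 'greeting = "hey, hi"' > practice.py
git add practice.py
git commit -q -m "Merge experimental and combine both greetings"
```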

Git | Merge tools

• Meld, P4Merge, WinMerge, KDiff3, etc.

Git | Remote repositories

• In its essence, Git is designed to also work with remote repositories (owned by others) to make sharing, distribution of and collaboration on code easier

• The GUI’s options for these remote operations are very limited at the moment, so it’s a good idea to use the command line for this

Git | Cloning

• Copy the contents of a remote repository to a folder on your computer (also called a local copy)
• You cannot contribute back to the upstream repo unless it’s yours or its owner has added you as a “collaborator”


Git | Cloning

• Clone (i.e. make a local copy of) a repository in the GUI, or in the CLI with
> git clone <remote-url>
For example
> git clone https://github.com/dschreij/Data-Analysis-Toolbox

• To update your local copy later with the latest version of the remote repo, go to its folder and type
> git pull
(or use the GUI sync button)

• Note that this might cause merge conflicts with your local version that you need to resolve
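Cloning and pulling can be practised offline, since git clone also accepts a path to a repository on disk. In the sketch below a local repository named upstream_repo (a made-up name) plays the role of the repository on github.com; with a real remote you would pass its https:// URL instead.

```shell
set -e
workdir=$(mktemp -d); cd "$workdir"

# Stand-in for a repository hosted on github.com.
git init -q upstream_repo
( cd upstream_repo
  git config user.name "Demo User"
  git config user.email "demo@example.com"
  echo 'v1' > notes.txt
  git add notes.txt
  git commit -q -m "First version" )

git clone -q upstream_repo local_copy    # make a local copy

# Meanwhile, the remote repository gets a new commit...
( cd upstream_repo
  echo 'v2' >> notes.txt
  git commit -q -a -m "Second version" )

cd local_copy
git pull -q        # fetch the new commit and merge it into the local copy
cat notes.txt      # now contains both lines
```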

GitHub | Forking

• A fork is a remote copy (on your account at github.com) of another repository
• Changes you make can (only) be pushed to your own copied repo

[Diagram: GitHub, with “Other account” and “Your account”]

GitHub | Pull requests

If you want your changes and/or additions to be merged into the upstream repo, you have to send its owner a pull request

[Diagram: GitHub, with “Other account” and “Your account”]

GitHub | Pull requests

The owner can then merge all of your commits (or cherry-pick some of them) into their own repo

[Diagram: GitHub, with “Other account” and “Your account”]

Github | Pull requests

Git | Updating forked repos

• If you want to update your copy with newer versions from the original repo, this can be done with:
> git fetch upstream
> git merge upstream/<branch_to_merge>
(e.g. git merge upstream/master)

or, in a single step,
> git pull upstream <branch_to_merge>
(a plain git pull only does this if your branch tracks the upstream repo)

• Sometimes the link to the upstream repo is not set yet. Do this with:
> git remote add upstream <url_to_upstream_repo>
(e.g. git remote add upstream https://github.com/tknapen/analysis_tools)
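The whole fork-update cycle can be sketched with three local repositories: original plays the upstream repo, fork.git plays your fork on github.com, and local is your working copy. All three names are assumptions for the demo; with real repositories only the URLs differ.

```shell
set -e
workdir=$(mktemp -d); cd "$workdir"

# "original" stands in for the upstream repository owned by someone else.
git init -q original
( cd original
  git config user.name "Demo User"
  git config user.email "demo@example.com"
  echo 'v1' > notes.txt
  git add notes.txt
  git commit -q -m "First version" )
branch=$(cd original && git symbolic-ref --short HEAD)

git clone -q --bare original fork.git    # stand-in for your fork on github.com
git clone -q fork.git local              # your working copy (origin = the fork)

# The upstream repository moves on.
( cd original
  echo 'v2' >> notes.txt
  git commit -q -a -m "Second version" )

cd local
git remote add upstream "$workdir/original"   # set the link to the upstream repo
git fetch -q upstream                         # download the new upstream commits
git merge -q "upstream/$branch"               # merge them into the local branch
git push -q origin "$branch"                  # optionally update the fork as well
```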

Git(hub) | Tutorials

• For an elaborate list of Git commands with explanations, have a look at https://www.atlassian.com/git/tutorial/git-basics
• For an interactive tutorial, together with example files, go to http://gitimmersion.com/index.html
• GitHub’s own documentation is also worth a look: https://help.github.com/articles/set-up-git

Tomorrow

• NumPy + SciPy
• Pandas + Matplotlib