Git 101 – Everything you didn’t know that you needed to know about git





git_101

presentation for reputation.com engineering on general git topics

On Github szeller / git_101

Git 101

Everything you didn’t know that you needed to know about git

What is git?

A distributed revision control and source code management (SCM) system with an emphasis on speed, data integrity, and support for distributed, non-linear workflows.
  • Developed by Linus Torvalds for managing the Linux kernel source
  • Built as a replacement for BitKeeper when the licensing changed

Seriously, Shawn, what is it?

  • A database of file contents, directory information, and commits, with each commit chained to its parents in a tree of objects stretching from the first commit to the latest
  • A set of command line utilities for managing the state of the database
  • The utilities also manage the state of a single working copy per checkout
  • Each checkout contains all the info necessary to construct the working state of any commit

What git isn't ...

  • User friendly
  • Opinionated
  • SVN
  • Straightforward
  • GitHub *

* However, GitHub is the reason we are all using git today

Beginner git mistakes

  • Learning commands vs learning flows
  • Not understanding how things work
  • Not remembering what branch they are on
  • Using merge instead of rebase
  • Not using branches enough

Topics you should look up later

  • Understand all the options to git log, e.g.
    git log -2 -w -p origin/master^
  • Install git-up - https://github.com/aanand/git-up
  • Learn how to set up aliases in your ~/.gitconfig
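
As a minimal sketch of the alias tip, the commands below define two aliases. The names ci and lg are illustrative assumptions, not something the deck prescribes. The sketch writes to a throwaway repository's config; adding --global to the git config calls would store the aliases in ~/.gitconfig instead.

```shell
set -e
# Throwaway repository so the sketch does not touch your real config
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical aliases; add --global to store them in ~/.gitconfig
git config alias.ci commit
git config alias.lg "log --graph --oneline --decorate"

# The aliases now behave like built-in subcommands
git ci -q --allow-empty -m "first commit via the ci alias"
git lg
alias_target=$(git config --get alias.ci)
```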

Everything is a reference

In general, everything that you use to refer to an object (checkin/branch/tag/etc) in git is a reference into the giant tree of commits. Some implications of this:

  • Branches are free since a branch is just a reference that moves whenever you do a commit
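  • Tags are references that stay put: unlike a branch, a tag does not move when you commit
  • HEAD is a reference to the commit (usually via a branch) that your working copy is based on
  • When you delete a local branch, it doesn't actually go away: only the reference is removed, and the commits stay in the database until they are garbage collected
  • Almost any local commit can be recovered, since commands that change history create new commits instead of overwriting old ones, and the old ones remain reachable through the reflog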
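
To make the last two points concrete, here is a small sketch run in a throwaway repository. The branch names doomed and recovered are made up for illustration. It deletes a branch and then brings the commit back by pointing a new reference at it.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.email "demo@example.com"
git config user.name "Demo"

git commit -q --allow-empty -m "first commit"
git checkout -q -b doomed
git commit -q --allow-empty -m "work that only lives on doomed"
lost=$(git rev-parse HEAD)

git checkout -q master
git branch -q -D doomed        # the reference is gone, the commit is not

# The commit is still in the object database and listed in HEAD's reflog
object_type=$(git cat-file -t "$lost")
git reflog | head -n 3

# Point a new branch at the old commit to recover it
git branch recovered "$lost"
recovered=$(git rev-parse recovered)
```

In day-to-day use you would normally find the lost hash by reading git reflog rather than saving it in a variable first.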

Reference syntax

* ff2f4cd | (HEAD, foo) checkin some even better stuff (2 minutes ago) (Shawn Zeller)
* 4ba0f66 | checkin some great stuff (2 minutes ago) (Shawn Zeller)
* 337330e | (origin/develop, develop) restore /registry since that seems to be what we use for the health check (4 days ago) (Shawn Zeller)
*   d778846 | merge autoZK to dev (8 days ago) (austin chau)
|\
| * a205cc4 | use latest debian plugin (8 days ago) (austin chau)
						

337330e, HEAD^^, HEAD~2, develop, origin/develop, and 337330e3c25f6c615f0365cc592ed6f0a815fbf1 all refer to the same commit.
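
One way to check this kind of equivalence yourself is git rev-parse, which resolves any reference to a full hash. The sketch below rebuilds a similar three-commit history in a throwaway repository (so the hashes will differ from the slide, and there is no origin/develop because no remote is configured).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop  # start on a branch named develop
git config user.email "demo@example.com"
git config user.name "Demo"

git commit -q --allow-empty -m "restore /registry"
git checkout -q -b foo
git commit -q --allow-empty -m "checkin some great stuff"
git commit -q --allow-empty -m "checkin some even better stuff"

# Four spellings of the same commit
by_caret=$(git rev-parse 'HEAD^^')          # parent of the parent of HEAD
by_tilde=$(git rev-parse HEAD~2)            # two first-parent steps back
by_branch=$(git rev-parse develop)          # branch name
short=$(git rev-parse --short develop)      # abbreviated hash
by_short=$(git rev-parse "$short")

printf '%s\n' "$by_caret" "$by_tilde" "$by_branch" "$by_short"
```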

Fun things you can do with git

  • Diff state between any two commits
    git diff -w cd78ce87e1a HEAD main.coffee
  • Create a branch from any commit
    git branch a/cool/branch/name cd78ce87e1a
  • Remove all .deb files from the history of the repo
    git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch *.deb' --prune-empty --tag-name-filter cat -- --all
  • Turn a subdirectory of a git repo into a top level repository
    git filter-branch --prune-empty --subdirectory-filter some_cool_folder master
  • Deploy code!
    git push heroku master
  • Host a static website - https://pages.github.com

You aren't branching enough

The history of shared working branches should be treated as immutable once your commits have been shared. Feature/working branches allow for:

  • Making sure that you don't accidentally break the build
  • Storing incremental work on GitHub so that you can more easily show it to someone else
  • Making as many incremental commits as you want
  • Cleaning up your commit history later

In general, all work should be on a branch.

Modifying branch history the easy way - git reset

* ff2f4cd | (HEAD, foo) checkin some even better stuff (2 minutes ago) (Shawn Zeller)
* 4ba0f66 | checkin some great stuff (2 minutes ago) (Shawn Zeller)
* 337330e | (origin/develop, develop) restore /registry since that seems to be what we use for the health check (4 days ago) (Shawn Zeller)
						

  • That last checkin sucked, delete it!
    git reset --hard HEAD^
  • Let's remove the last commit but keep the changes
    git reset HEAD^
  • Let's make one commit out of everything that isn't in develop already
    git reset develop
    git add .
    git commit -m "all my stuff"
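
The squash recipe above can be tried end to end in a throwaway repository. This is a sketch with made-up file names; the branch names develop and foo follow the slide. Note that git add . also stages any other untracked files in the directory, so check git status first in a real repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt && git add . && git commit -q -m "base"
git checkout -q -b foo
echo great > great.txt && git add . && git commit -q -m "checkin some great stuff"
echo better > better.txt && git add . && git commit -q -m "checkin some even better stuff"

# Mixed reset (the default): the branch pointer moves to develop,
# but the files from the two removed commits stay in the working copy
git reset -q develop
git status --short

git add .
git commit -q -m "all my stuff"
commits_ahead=$(git rev-list --count develop..foo)   # foo is now 1 commit ahead
```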

Git Merge

* 337330e | (origin/develop, develop) restore /registry since that seems to be what we use for the health check (4 days ago) (Shawn Zeller)
*   d778846 | merge autoZK to dev (8 days ago) (austin chau)
|\
| * a205cc4 | use latest debian plugin (8 days ago) (austin chau)
						
Pull another branch's history into your branch. Essentially does the following:

  • Figure out the latest commit that both branches share
  • Apply all the changes made on the other branch since that commit to the working copy
  • Optional: Allow the user to resolve any conflicts
  • Create a merge commit with the results of the merge that references both branches

Note: In the case of a trivial (fast-forward) merge, where your branch has no commits of its own since the shared commit, git simply updates the current branch to reference the tip commit of the branch being merged.
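
The two outcomes can be seen side by side in a throwaway repository. This is a sketch with made-up branch and file names (feature, quickfix); git merge-base shows the shared commit that the merge starts from.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt && git add . && git commit -q -m "base"
git checkout -q -b feature
echo feature > feature.txt && git add . && git commit -q -m "feature work"
git checkout -q master
echo master > master.txt && git add . && git commit -q -m "master work"

# The latest commit both branches share
base=$(git merge-base master feature)

# Both sides have new commits, so git creates a merge commit with two parents
git merge -q --no-edit feature
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w)   # commit + 2 parents = 3

# Fast-forward case: master has nothing new, so it just moves to the other tip
git checkout -q -b quickfix
echo fix > fix.txt && git add . && git commit -q -m "quick fix"
git checkout -q master
git merge -q quickfix
ff_tip=$(git rev-parse master)
```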

Git Merge sounds great ... but

Warning: Running git merge with non-trivial uncommitted changes is discouraged: while possible, it may leave you in a state that is hard to back out of in the case of a conflict.

What this really means is that reverting merge commits is a pain. It also makes unraveling the history of the repo equally painful.

So, Shawn, what should we do to manage all of these branches that you tell us to have?

Git Rebase!

Yep, git rebase deserves an extra large title and transition slide

What is rebase?

Forward-port local commits to the updated upstream head

Like many git commands, you can use rebase for a lot of things. However, there are two main uses for rebase.

  • In interactive mode, it allows you to reorder, merge, alter, etc. some set of commits
  • Used instead of merge, it allows you to take all of your commits and replay them on top of another branch

Interactive rebase example

pick 2cad4d7 add some entries to .gitignore
pick 337330e restore /registry 
pick 4ba0f66 checkin some great stuff
pick ff2f4cd checkin some even better stuff

# Rebase ac0b6a7..ff2f4cd onto ac0b6a7
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#  x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

Branch maintenance via rebase

You have a feature branch named foo branched off of master. Since branching, more commits went into master.

git checkout foo
git rebase master
  • Rewind your branch (foo) back to the last point in the tree the branches share
  • Pull everything from master into foo
  • Apply each change that was rewound away to the branch again, one by one
  • Optional: Whenever there are conflicts, the user resolves the conflicts
  • Once all resolution is done, your branch has all of your work after every commit in master.
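To then collapse everything in foo into a single commit on top of master (git ci is assumed here to be an alias for git commit):
git reset master
git add .
git ci -m "one commit with everything that was in foo"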
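
The rebase step can be reproduced in a throwaway repository. This is a sketch with made-up file names that uses a conflict-free history, so the optional conflict resolution step never triggers.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt && git add . && git commit -q -m "base"
git checkout -q -b foo
echo foo > foo.txt && git add . && git commit -q -m "foo work"
git checkout -q master
echo more > more.txt && git add . && git commit -q -m "master moved on"

git checkout -q foo
before=$(git rev-parse foo)
git rebase -q master
after=$(git rev-parse foo)

# foo's commit now sits directly on top of master's tip
foo_parent=$(git rev-parse foo~1)
master_tip=$(git rev-parse master)
git log --oneline --graph foo
```

The commit on foo has the same message and changes as before, but a different hash, because its parent changed.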

Rebase Caveats

  • All of the rebased commits will have different hashes, since a commit's hash depends on its parent and every rebased commit gets a new parent
  • Only rebase a branch that isn't shared; otherwise you will make everyone else's life painful
  • Once you rebase a branch that has already been pushed, pushing it to GitHub again requires the -f option
  • Rebase feature branches onto shared branches, and use merge to pull feature branches into shared branches
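
The -f caveat can be demonstrated without GitHub by using a local bare repository as a stand-in remote. This is a sketch under that assumption, with hypothetical paths and file names: the plain push is rejected after the rebase, and the forced push succeeds.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"     # stand-in for the GitHub remote
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin "$work/origin.git"

echo base > base.txt && git add . && git commit -q -m "base"
git push -q origin master
git checkout -q -b foo
echo foo > foo.txt && git add . && git commit -q -m "foo work"
git push -q origin foo                    # foo is now published

git checkout -q master
echo more > more.txt && git add . && git commit -q -m "master moved on"
git checkout -q foo
git rebase -q master                      # rewrites foo's commit

# A normal push is refused because the remote foo is not an ancestor
if git push -q origin foo 2>/dev/null; then rejected=no; else rejected=yes; fi

git push -q -f origin foo                 # overwrite the remote branch
remote_foo=$(git rev-parse origin/foo)
local_foo=$(git rev-parse foo)
```

git push --force-with-lease is a more cautious variant that refuses to overwrite remote commits you have not fetched yet, which helps if someone else pushed to the branch in the meantime.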

Questions?