Proactively Preventing Git Pickles
As a git enthusiast, I am often called in to rescue co-workers or mentees when they have indisputably (and sometimes irreversibly) painted themselves into a corner with git. In order to provide assistance, I first have to understand what they were trying to do, and then trace through the steps that got them there.
Like a victim of a crime who is being questioned, they’re frustrated that I’m asking them questions instead of out there tracking down the real perpetrator (Linus Torvalds?). After all, there’s this possessed machine and it’s eating their code. Something must be done!
I’m sorry to break the news, but git is not an evil robot intent on destroying your code or career. (I’m pretty sure the uprising won’t be for at least another 25 years.) If you’re a software professional - especially one who uses git at work or in consulting - you need to know how to manage and protect the code you write. I believe everyone gets one free pass for a major git screw-up. After that, you should have learned from what you did wrong.
So, what’s the best way to get better at git? Its documentation is robust, but it can be overwhelming. Tutorials like Atlassian’s are great for concepts. Cheat sheets like Tower’s are helpful on the command line - which is the primary way I recommend using git. These are great resources, but like many other things in technology: In order to get really comfortable with git, you need practice and communication. It also helps if, when you do screw up, you’re curious about the problem/solution… rather than, say, throwing your laptop out the window.
That said, I follow some best practices that I think are worth sharing. If you follow these principles, you’ll rarely find yourself up git creek again.
1. Commit early, commit often; push early, push often
You’ve likely heard this mantra before. It’s great advice, but committing alone is insufficient. Pushing backs up your work. If you back up your work every time you have work to be backed up, you’ll always have (you guessed it) a backup. It’s not rocket science, it’s just a good habit.
The act of committing and pushing together should be a mindless ritual, like putting on your seatbelt before you put your car in drive. In fact, I add -> commit -> push so often as a single act that I have a command-line alias for it.
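A minimal sketch of such an alias, written as a shell function. The name gacp, the use of "git add --all", and the remote name "origin" are assumptions for illustration; adjust them to your own setup.

```shell
# Hypothetical helper: stage everything, commit, and push in one step.
# Usage: gacp "commit message"
gacp() {
    if [ -z "$1" ]; then
        echo "usage: gacp <commit message>" >&2
        return 1
    fi
    # Each step runs only if the previous one succeeded.
    # "origin HEAD" pushes the current branch to a same-named branch on origin.
    git add --all &&
        git commit -m "$1" &&
        git push origin HEAD
}
```

Because the steps are chained with &&, a failed commit (for example, nothing to commit) stops the push from running.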
2. Harness the power of branches
Branching in git is cheap, meaning that it takes almost no time/space for you (or the git system) to make a branch and switch to it. A branch should represent a logical unit of work. When you start to diverge from that branch’s intention, you should make a new branch. Because branches are really refs, you can think of a new branch name like a snapshot of your code, which allows you to easily move backward or forward in the code’s history using a human-readable name.
Let’s say I’m working on a feature, #1234, that is supposed to include the general tasks:
- Add a new field to a model
- Write a migration script
- Expose the new field to the UI
As I’m working on step 1, I see something that I want to fix or do. This "something" is either:
- related to accomplishing step 1, so I make a new branch and call it something like "1234-model-refactor-schema".
- unrelated to accomplishing step 1, and either:
  - related to step 2 or 3, so I put a TODO in the code as a reminder to come back and address it when I get to those parts.
  - unrelated to issue 1234, so (after thinking long and hard about whether this is worth derailing me from my planned work) I branch from master and call it something like "remove-warnings-in-models".
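The branching decisions above can be sketched on the command line roughly as follows. This runs in a throwaway repository so it is safe to try; the feature branch name "1234-add-model-field" is a hypothetical name for the main work on issue 1234, while the other two branch names come from the list above.

```shell
set -e
# Throwaway repo so the sketch can run anywhere; in practice you would
# already be inside your project's repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# The planned work for issue 1234 gets its own branch (hypothetical name).
git checkout -q -b 1234-add-model-field

# Something related to step 1 comes up: branch off the feature branch.
git checkout -q -b 1234-model-refactor-schema

# Something unrelated to issue 1234 comes up: branch from master instead.
git checkout -q -b remove-warnings-in-models master

# List the branches; the current one is marked with an asterisk.
git branch --list
```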
Branches are like writing your own "Choose your own adventure" book. You set the markers and connect them whenever and wherever it seems appropriate. Use cognizant discretion as you decide which changes are relevant to your planned work and which are not.
3. Pick one: merge or rebase
Teams or architects often have strong opinions about whether everyone should merge or rebase. Not unlike the proverbial "tabs vs spaces" debate, the “merge vs rebase” debate is divisive. My advice is the same for both: It doesn’t matter what you do, but be consistent.
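Merging (when it is not a simple fast-forward) creates a merge commit - a commit with two parents that ties the two histories together - in addition to the commits in the branch being merged. Rebasing instead replays your branch’s commits on top of the destination branch, producing new commits and a linear history with no merge commit. Mixing the two on the same branch (merging and then rebasing, or vice versa) doesn’t really make sense and will only serve to confuse git... and you.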
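To see the difference for yourself, here is a small sketch that integrates the same feature work both ways in a throwaway repository. All branch and file names are made up for the demonstration, and the two result branches exist only so the outcomes can be compared side by side; on a real project you would pick one approach.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

# One commit of feature work...
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# ...while master moves on, so the histories diverge.
git checkout -q master
echo more > more.txt
git add more.txt
git commit -q -m "Master moves on"

# Option A: merge. The result gains a merge commit with two parents.
git checkout -q -b result-merge master
git merge -q --no-ff -m "Merge feature" feature

# Option B: rebase a copy of the feature branch onto master, then
# fast-forward. The result is a straight line of commits.
git checkout -q -b feature-rebased feature
git rebase -q master
git checkout -q -b result-rebase master
git merge -q --ff-only feature-rebased

# Count the merge commits in each result to compare the two histories.
echo "merge commits via merge:  $(git rev-list --count --merges result-merge)"
echo "merge commits via rebase: $(git rev-list --count --merges result-rebase)"
```

Running "git log --oneline --graph" on each result branch shows the same difference visually.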
4. The team’s code comes to you (not the other way around)
You can mitigate the risk that you’ll screw something up if you ensure your code (on the source branch) is completely up to date with the code in the destination branch before merging it in. You do this by 1) fetching and then 2a) merging the destination branch into your source branch or 2b) rebasing your source branch onto the destination branch.
This is important because conflicts will be resolved on your source branch (e.g. issue-1234), not the presumably more important destination branch (e.g. develop or master). You can test your code locally before merging, so no merge-related surprises appear in the main code line. If something is broken, you can easily revert or reset that update and try again. The broken code only affects you, not your entire team.
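As a rough sketch of that fetch-then-update flow, the script below simulates a shared repository and a teammate using local directories. The directory layout, file names, and commit messages are assumptions for the demo; the branch names issue-1234 and develop are the ones used above.

```shell
set -e
tmp=$(mktemp -d)

# Stand-in for the team's shared repository, with "develop" as its main line.
git init -q --bare "$tmp/origin.git"
git --git-dir="$tmp/origin.git" symbolic-ref HEAD refs/heads/develop

# A teammate seeds develop.
git init -q "$tmp/teammate"
cd "$tmp/teammate"
git symbolic-ref HEAD refs/heads/develop
git config user.name "Teammate"
git config user.email "teammate@example.com"
git remote add origin "$tmp/origin.git"
echo v1 > shared.txt
git add shared.txt
git commit -q -m "Initial shared code"
git push -q origin develop

# You clone and start work on your source branch.
git clone -q "$tmp/origin.git" "$tmp/me"
cd "$tmp/me"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git checkout -q -b issue-1234
echo mine > feature.txt
git add feature.txt
git commit -q -m "Work on issue 1234"

# Meanwhile, the teammate pushes more work to develop.
cd "$tmp/teammate"
echo v2 > shared.txt
git commit -q -am "Update shared code"
git push -q origin develop

# 1) Fetch, then 2a) merge the destination branch into your source branch.
cd "$tmp/me"
git fetch -q origin
git merge -q -m "Merge origin/develop into issue-1234" origin/develop
# 2b) The rebase alternative would be: git rebase origin/develop

# Your branch now contains everything on develop; test here before merging back.
git merge-base --is-ancestor origin/develop HEAD &&
    echo "issue-1234 is up to date with develop"
```

Any conflicts would surface during the merge (or rebase) step, on issue-1234, where only you are affected.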
5. Remotes to the rescue!
Most people just have one remote, called "origin". Perhaps you have another called “upstream”, or one for deploying (e.g. “heroku”). Remotes are simply pointers, with custom nicknames, to a git server somewhere else. The somewhere else part is critical when you have screwed up (or permanently lost) code on your computer and the main repo. For example, you force pushed a bunch of deletions by accident.
Because git is decentralized, remember that everyone working with the repository (including deployment remotes) has a copy of the entire repo on their devices. True, that code might be fairly stale (depending on how often you push and how often they fetch) but it probably exists somewhere. If you’ve accidentally obliterated some code, you should own up to it quickly to prevent the problem from spreading and find some cloned instance of the repo that has a copy you can reinstate.
It’s reassuring to have a remote to another git server and push there regularly as well. That might be a forked repo whose remote is called “mine”, or a repo on a different server (e.g. Bitbucket) called “backup”. The repo and remote names can be whatever you want, and you can have a script that automatically pushes to it once a day. (Note that, assuming everyone is working from the same repo, a fork is simply a convenient backup mechanism. Otherwise they tend to clutter and confuse the workflow.)
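Here is one way the backup remote could be set up, sketched with a local bare repository standing in for the second server. The paths and file names are assumptions for the demo; on a real project the remote URL would point at your fork or other hosted repo.

```shell
set -e
tmp=$(mktemp -d)

# Stand-in for a repository on a different server.
git init -q --bare "$tmp/backup.git"

# Stand-in for your working repository.
git init -q "$tmp/project"
cd "$tmp/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo work > notes.txt
git add notes.txt
git commit -q -m "Some work worth keeping"

# Register the second server under the nickname "backup".
git remote add backup "$tmp/backup.git"

# Push every local branch, then every tag. These two lines are the part
# a once-a-day script would run.
git push -q backup --all
git push -q backup --tags

# Show the configured remotes and where they point.
git remote -v
```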
6. Have a "repo playground"
The fear, worry, and frustration a lot of people have with git is partly due to the fact that git doesn’t always behave the way you think it will/should. I don’t know about you, but I don’t want my code to be a guinea pig in a git experiment. I want to be sure what’s going to happen before I pull the trigger.
It’s helpful to have a couple of repositories that you just don’t care about at all. They could contain text files or a similarly-stacked application, possibly generated with a scaffolding tool like Yeoman. You can get comfortable with git by making inconsequential changes, checking them in, merging them, reverting them, etc. You can model out your exact scenario (e.g. coworker changed a JS function in this way and I changed it some other way), then see what happens when you merge/cherry pick/rebase/etc.
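As an illustration, the coworker scenario could be modeled in a playground roughly like this. The file app.js, the greet function, and the branch names are all invented for the experiment.

```shell
set -e
play=$(mktemp -d)
cd "$play"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'function greet() { return "hi"; }\n' > app.js
git add app.js
git commit -q -m "Add greet"

# The coworker changes the function one way...
git checkout -q -b coworker
printf 'function greet() { return "hello"; }\n' > app.js
git commit -q -am "Coworker changes greet"

# ...and I change the same line some other way.
git checkout -q -b mine master
printf 'function greet() { return "hey"; }\n' > app.js
git commit -q -am "I change greet"

# Try the merge and see what git does with it.
if git merge -m "Merge coworker into mine" coworker; then
    echo "merged cleanly"
else
    echo "conflict: take a look, then back out"
    git status --short
    git merge --abort   # restore the branch to its pre-merge state
fi
```

Because nothing in the playground matters, you can swap the merge for a rebase or cherry-pick and rerun it as many times as you like.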
7. Protect yourself from yourself
Branch protection and rules are a very important part of your git process. Not protecting your branch is like not locking the front door of your house. Not enforcing branch rules is like not enforcing rules about who can come in your house. I’ve encountered a surprising amount of pushback on this concept, usually from people who argue that good managers trust their developers and that people should do the right thing.
I'm not paranoid, but the truth is that people are fallible. I don’t believe that a thief will try to get into my house, but I’m still not going to leave my front door unlocked. While a thief is malicious, developers are trying to do the right thing. When someone accidentally pushes code to the wrong place, it’s a genuine mistake... and everyone makes mistakes.
Lock down* the master branch (and possibly others, like develop or QA) so that only a few leads can push to them directly. Fewer still should be able to force push. Ensure the branch people merge into most often (e.g. develop) is set as the default branch - this prevents people from accidentally opening a pull request to the wrong place. If you have unit tests or linters, ensure those things run and pass before a pull request can be merged. Configure all of your repositories this way, even if you’re the only collaborator.
*Different git cloud services use different terminology for this. For example, GitHub calls it “protected branches” while Bitbucket Server calls it “branch permissions”.
8. Follow the KISS principle
As I continue to observe how different organizations use git, I believe that most have overcomplicated their process at either a team or individual level. The KISS principle says "keep it simple, stupid."
Like other areas in technology, there are purist/dogma types who insist on following a process because they read it in a book, and there are rebel/freethinking types who think there is nothing to be learned from such nonsense. The best strategies lie somewhere in between, understanding that git can help you do almost anything with your code, but just because you can doesn’t mean you should.
Examples of this include:
- superfluous branches
- multiple branches with too-similar names
- allowing old branches to linger and get stale
- attempting to merge three branches at once (aka an octopus merge)
- merging one branch into multiple places
- feature branches that exist for more than a few weeks
- coding in a silo, knowing that the code is highly dependent on another branch
These are behaviors that can easily lead to conflicts and mistakes. Someone who can observe the code at this level can help the team avoid problems before they occur. There is no "one size fits all" solution, but it never hurts to be clear about the things you should NOT do, and course correct when you see them happening.
I hope you’ve picked up a few tricks that can help you avoid git pickles. By using git thoughtfully and deliberately, you can get a lot of work done without a lot of overhead or drama. Now, go git crackin!
Lyndsey Padget is a NodeJS developer in Kansas City, Missouri. She speaks about and offers consulting/training on Git, REST, Test Driven Development, discovering your inner badass, team productivity, and more.