Nouvelle Génération: Something Beautiful About Programming

What if I told you ...

Go on...

... that you could have a record of every change made to the many, many documents that go into your codebase ...

Well that’s interesting. I mean, I could use that for all kinds of things. I could use that to find out where bugs entered the system, for example.

... and a record of who made them, up to the minute, a permanent record ...

Well that is powerful. I could use it to review the progress my team was making, if I was a manager, by looking at the changes every day.

... and every change could be reviewed by anyone, in a totally transparent way ...

Everyone can keep up with all the changes and understand how the code is evolving? Every change?

... and you can bundle changes and turn them into branches, and anyone can make as many branches as needed, without violating the integrity of the other branches ...

I’d say this sounds like some insane parallel-universe fantasy. Someone can completely change the code without interfering with the work of others?

... and then you could merge a finished branch back into the main trunk of code, reviewing and fixing inconsistencies as you go ...

My God, it’s like you can hear inside my brain. So even though my code is a huge pile of fragile, interdependent components, I can have my code team off working in their own branches and then, because I have spent the time to have a solid testing suite, we can, at the appropriate time, merge their changes and run automated tests to make sure that everything is still working.

... and everyone can have the history of every change ever made to the code, even if the codebase is decades old ...

Shut up, shut up, and take everything, disembodied code voice, take everything! Take me!

... and it’s all completely, totally free to download and is the default way of distributing source code throughout the world!

(Faints.)

And that’s why everyone gets excited about GitHub. You should go to GitHub, you really should. You should poke around and look through the thousands of repositories there, read some of the README files. And you should look into the code, and then look at the commits. A “commit” is a moment of action captured and stored. You can compare one commit with another and see a “diff,” see what’s been added and what’s been removed. See what you can figure out. Take a look at the screen shot below.

First, we’re looking at the Django repository. This is the actual, real-life code that makes Django, the Web framework, run. It has 668 people keeping an eye on it, and 14,325 people have starred it as a favorite, and there are 5,692 forks—meaning that people have copied the code into their own repositories with some intention of manipulating and adding to or changing it. These numbers represent invested users. There are likely hundreds of thousands more who downloaded the code just to use it.
We see that a user, claudep, has checked in some code. He did this five hours ago, adding a “commit message” that reads “Fixed #24826—Accounted for filesystem-dependent filename max length.” He’s working in a file called tests.py, which means that this particular new code (marked in green and prefixed by “+” at the beginning of each line) is probably either test code or code to support tests. And thanks to user claudep, this code is now better than it was six hours ago.
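If you want to see where those green "+" lines come from, here is a minimal sketch using Python's standard difflib module. The file contents are invented for illustration and are not Django's actual code; only the mechanics of the diff are the point.

```python
import difflib

# Hypothetical before/after versions of a small file (not Django's real code).
before = [
    "def max_name_length():\n",
    "    return 255\n",
]
after = [
    "def max_name_length():\n",
    "    # Hypothetical change: some filesystems allow shorter names\n",
    "    return 143\n",
]

# unified_diff yields header lines, then context, removed (-) and added (+) lines.
diff_lines = list(
    difflib.unified_diff(before, after, fromfile="a/tests.py", tofile="b/tests.py")
)

# Separate real additions and removals from the "+++" and "---" file headers.
added = [line for line in diff_lines if line.startswith("+") and not line.startswith("+++")]
removed = [line for line in diff_lines if line.startswith("-") and not line.startswith("---")]

print("".join(diff_lines))
```

Lines that start with "+" exist only in the new version, and lines that start with "-" exist only in the old one. That is all a diff really is.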
This is the experience of using version control. It’s a combination news feed and backup system. GitHub didn’t invent version control. It took a program called git, which had been developed at first by Linus Torvalds, the chief architect of Linux, and started adding tools and services around it. The way git works is that you can copy the code and all the changes ever made to the code with one command:

git clone git@github.com:nodejs/node.git

That will copy all of the code that is and was in Node.js to your local machine. Now you can go in and change that code to your heart’s delight. When you’re done changing it, you can type

git add .

which adds the files that you changed; and then

git commit

which asks you to enter a commit message explaining what you’ve done; and then

git push origin master

which will cause an error because who do you think you are to come in and start pushing code to the node repository? But if you did have permission, that would push your changes to the master branch of the git repository that is hosted on GitHub.
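You can rehearse the add-and-commit part of that cycle without bothering anyone's repository. The sketch below assumes git is installed and drives it from Python in a throwaway directory. The file name, the commit message, and the "Example Coder" identity are all made up, and nothing is pushed anywhere.

```python
import pathlib
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed output."""
    # Identity is passed with -c so the sketch does not rely on global git config.
    base = [
        "git",
        "-c", "user.name=Example Coder",
        "-c", "user.email=coder@example.com",
        "-c", "commit.gpgsign=false",
    ]
    done = subprocess.run(
        base + list(args), cwd=repo, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def demo_commit():
    """Create a scratch repository, commit one file, and return the commit id."""
    if shutil.which("git") is None:
        return None  # git is not installed, so there is nothing to demonstrate
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init")
        pathlib.Path(repo, "hello.txt").write_text("Hello, version control.\n")
        git(repo, "add", ".")                       # stage the new file
        git(repo, "commit", "-m", "Add a greeting")  # record it with a message
        return git(repo, "rev-parse", "HEAD")        # ask for the new commit's id


commit_id = demo_commit()
print(commit_id)
```

The long hexadecimal string it prints is the identifier of the commit you just made. The scratch repository is deleted as soon as the function returns.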

These commands are now part of the sense memory of many programmers. They type variations on them dozens of times a day, checking in their code to keep a record of the work they’ve done, so they can rewind to any point if they go too deep and screw up too many things.
Sometimes the changes pile up to the point that you can look at them all and say, “This is good. We are ready to release some new code into the world.” Maybe you do this every two weeks; maybe you do it once a year. Maybe, like Facebook, you do it all the time.
If your software was at Version 2, you could bundle up all the changes and tag the code. Behold, Version 3.
A change comes in a few seconds later from a coder far away; doesn’t matter to Version 3. You’re done with Version 3. Version 3 is part of the permanent record. You might fix some bugs and call that Version 3.1. You might add another feature and call it Version 4.
Tools such as git give programmers a common language. “Did you check that in?” they ask. “Which commit was that?” “That was going to be in 2.4, but we pushed it to 2.5.” Because each commit gets a unique identifier, you can pinpoint that commit in space and time and feel confident in the record of code changes in a way that you can rarely feel confident about anything.
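Where does that unique identifier come from? In a classic git repository it is a SHA-1 hash of the object's type, its length, and its content, so the same content always gets the same name. Here is a rough sketch of the idea in Python. The author line, timestamp, and messages are placeholders, and newer repositories can be configured to use SHA-256 instead.

```python
import hashlib


def git_object_id(kind, body):
    """Return the SHA-1 id git gives an object of type `kind` with content `body`."""
    # git hashes a short header ("<type> <length>" plus a NUL byte) followed by the content.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(message):
    """Build a simplified commit body; the author and timestamp are made up."""
    tree = git_object_id("tree", b"")  # id of an empty directory snapshot
    who = "Example Coder <coder@example.com> 1433000000 +0000"
    text = f"tree {tree}\nauthor {who}\ncommitter {who}\n\n{message}\n"
    return text.encode()


# An empty file gets the same id in every SHA-1 repository.
empty_blob_id = git_object_id("blob", b"")

# Two commits that differ only in their message get different ids.
first = git_object_id("commit", commit_body("Fix the bug"))
second = git_object_id("commit", commit_body("Fix the bug, really"))

print(empty_blob_id)
print(first)
print(second)
```

Change a single character of the content or the message and the identifier changes with it, which is why a commit id can stand in for an exact state of the code.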
A side effect of this confidence is increased automation. Let’s say you have a Web server program that’s very popular and serves hundreds of millions of people every month. It runs on 50 different computers in the cloud. Aren’t you something.
Your diligent decentralized team frequently writes new code that runs on the servers. So here’s a problem: What’s the best way to get that code onto those 50 computers? Click and drag with your mouse? God, no. What are you, an animal? You set up a continuous integration server and install plug-ins and let the robots serve you.
Programmers hardly talk about code. They chat about data. They chat about requirements and interesting approaches. And they chat constantly about deployment. Which makes sense, because that’s the goal of their work—getting their code from their brain through testing and out to the world, in Web, app, or other form. Programmers, good ones, want to ship and move on to the next nail-biting problem. So there are lots of policies, tons of them, for deploying fresh code. For example:

All programming work must happen in a branch.
When work is done, we will merge it back into the main branch; and
Run tests;
Then “push” the code over to GitHub.
At which point an automated service will run; and
A service running on each of the 50 computers will “check out” the code; and
Install it, overwriting the old version;
Then stop the computer’s Web servers;
Then restart them, so the new code can load and get to work.
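The policy above is really just an ordered list of steps with one gate in the middle. Here is a toy sketch of that shape in Python. It only records what would happen and does not touch git or any server, and the function name, server names, and step wording are all invented for illustration.

```python
def deploy(run_tests, servers):
    """Return the ordered list of steps a deployment would take."""
    log = ["merge branch into main"]

    # The tests are the gate: if they fail, nothing is pushed or installed.
    if not run_tests():
        log.append("tests failed: stop before pushing")
        return log

    log.append("push to GitHub")

    # Each server repeats the same four steps, in order.
    per_server_steps = (
        "check out code",
        "install over old version",
        "stop web server",
        "restart web server",
    )
    for server in servers:
        for step in per_server_steps:
            log.append(f"{server}: {step}")
    return log


# Hypothetical usage: two servers instead of 50, with tests that pass.
good_run = deploy(lambda: True, ["web-01", "web-02"])
bad_run = deploy(lambda: False, ["web-01", "web-02"])

for line in good_run:
    print(line)
```

A real setup would hand these steps to a continuous integration service rather than a single function, but the ordering and the test gate are the part worth noticing.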

See, tests and version control are now the trigger for actually shipping code. If you can follow a process like this, you can release software several times a day—which in the days of shrink-wrapped software would have been folly. (Often builds were done nightly, by big “build servers,” and one would come in the next morning to get the score.) But now that software can be released via the Web or an app store, why wait? Why not continually release software, every day, whenever you have something that’s ready to go?