What if I told you ...
... that you could have a record of every change made to the many, many documents that go into your codebase ...
Well that’s interesting. I mean, I could use that for all kinds of things. I could use that to find out where bugs entered the system, for example.
... and a record of who made them, up to the minute, a permanent record ...
Well that is powerful. I could use it to review the progress my team was making, if I was a manager, by looking at the changes every day.
... and every change could be reviewed by anyone, in a totally transparent way ...
Everyone can keep up with all the changes and understand how the code is evolving? Every change?
... and you can bundle changes and turn them into branches, and anyone can make as many branches as needed, without violating the integrity of the other branches ...
I’d say this sounds like some insane parallel-universe fantasy. Someone can completely change the code without interfering with the work of others?
... and then you could merge a finished branch back into the main trunk of code, reviewing and fixing inconsistencies as you go ...
My God, it’s like you can hear inside my brain. So even though my code is a huge pile of fragile, interdependent components, I can have my code team off working in their own branches and then, because I have spent the time to have a solid testing suite, we can, at the appropriate time, merge their changes and run automated tests to make sure that everything is still working.
... and everyone can have the history of every change ever made to the code, even if the codebase is decades old ...
... and it’s all completely, totally free to download and is the default way of distributing source code throughout the world!
(Faints.)
And that’s why everyone gets excited about GitHub. You should go to GitHub, you really should. You should poke around and look through the thousands of repositories there, read some of the README files. And you should look into the code, and then look at the commits. A “commit” is a moment of action captured and stored. You can compare one commit with another and see a “diff,” see what’s been added and what’s been removed. See what you can figure out. Take a look at the screen shot below.
First, we’re looking at the Django repository. This is the actual, real-life code that makes Django, the Web framework, run. It has 668 people keeping an eye on it, and 14,325 people have starred it as a favorite, and there are 5,692 forks—meaning that people have copied the code into their own repositories with some intention of manipulating and adding to or changing it. These numbers represent invested users. There are likely hundreds of thousands more who downloaded the code just to use it.
We see that a user, claudep, has checked in some code. He did this five hours ago, adding a “commit message” that reads “Fixed #24826—Accounted for filesystem-dependent filename max length.” He’s working in a file called tests.py, which means that this particular new code (marked in green and prefixed by “+” at the beginning of each line) is probably either test code or code to support tests. And thanks to user claudep, this code is now better than it was six hours ago.
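If you want a feel for where those “+” lines come from, here is a minimal sketch using Python's standard difflib module. It is not how git or GitHub computes diffs internally, and the before and after snippets are invented for illustration; they are not the real contents of Django's tests.py.

```python
import difflib


def show_diff(old_text: str, new_text: str, filename: str = "tests.py") -> list[str]:
    """Return unified-diff lines describing how old_text became new_text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"a/{filename}",
            tofile=f"b/{filename}",
            lineterm="",
        )
    )


# Hypothetical file contents, made up for this example.
before = "def test_name():\n    check(name)\n"
after = "def test_name():\n    check(name)\n    check(len(name) <= MAX_LEN)\n"

diff_lines = show_diff(before, after)

if __name__ == "__main__":
    # Lines starting with "+" were added; lines starting with "-" were removed.
    print("\n".join(diff_lines))
```

Unchanged lines show up with a leading space, which is why a diff reads like the old file with the edits penciled in.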
This is the experience of using version control. It’s a combination news feed and backup system. GitHub didn’t invent version control. It took a program called git, which was originally developed by Linus Torvalds, the chief architect of Linux, and started adding tools and services around it. The way git works is that you can copy the code and all the changes ever made to the code with one command:
git clone git@github.com:nodejs/node.git
That will copy all of the code that is and was in Node.js to your local machine. Now you can go in and change that code to your heart’s delight. When you’re done changing it, you can type
git add .
which adds the files that you changed; and then
git commit
which asks you to enter a commit message explaining what you’ve done; and then
git push origin master
which will cause an error because who do you think you are to come in and start pushing code to the node repository? But if you did have permission, that would push your changes to the master branch of the git repository that is hosted on GitHub.
These commands are now part of the sense memory of many programmers. They type variations on them dozens of times a day, checking in their code to keep a record of the work they’ve done, so they can rewind to any point if they go too deep and screw up too many things.
Sometimes the changes pile up to the point that you can look at them all and say, “This is good. We are ready to release some new code into the world.” Maybe you do this every two weeks; maybe you do it once a year. Maybe, like Facebook, you do it all the time.
If your software was at Version 2, you could bundle up all the changes and tag the code. Behold, Version 3.
A change comes in a few seconds later from a coder far away; doesn’t matter to Version 3. You’re done with Version 3. Version 3 is part of the permanent record. You might fix some bugs and call that Version 3.1. You might add another feature and call it Version 4.
Tools such as git give programmers a common language. “Did you check that in?” they ask. “Which commit was that?” “That was going to be in 2.4, but we pushed it to 2.5.” Because each commit gets a unique identifier, you can pinpoint that commit in space and time and feel confident in the record of code changes in a way that you can rarely feel confident about anything.
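Where does that unique identifier come from? Git names every stored object by hashing a short header plus the object's contents, so the same contents always get the same name and any change produces a new one. The sketch below assumes the traditional SHA-1 object format (a repository can be set up to use SHA-256 instead). The function name and the commit text are hypothetical, with placeholder values; a real commit body is written by git itself.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash "<kind> <size>" + NUL byte + body, the way git names an object (SHA-1 assumed)."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def toy_commit(message: str) -> bytes:
    """Build a commit-shaped body. Every value here is a placeholder for illustration."""
    return (
        "tree " + "0" * 40 + "\n"
        "author A. Coder <coder@example.com> 0 +0000\n"
        "committer A. Coder <coder@example.com> 0 +0000\n"
        "\n" + message + "\n"
    ).encode()


first_id = git_object_id("commit", toy_commit("Fix the bug"))
second_id = git_object_id("commit", toy_commit("Fix the bug, really"))

if __name__ == "__main__":
    # Changing one word of the message yields a completely different identifier.
    print(first_id)
    print(second_id)
```

Because the identifier is derived from the content, two people who hold the same commit hold the same 40-character name for it, which is what makes “Which commit was that?” an answerable question.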
A side effect of this confidence is increased automation. Let’s say you have a Web server program that’s very popular and serves hundreds of millions of people every month. It runs on 50 different computers on the cloud. Aren’t you something.
Your diligent decentralized team frequently writes new code that runs on the servers. So here’s a problem: What’s the best way to get that code onto those 50 computers? Click and drag with your mouse? God, no. What are you, an animal? You set up a continuous integration server and install plug-ins and let the robots serve you.
Programmers hardly talk about code.
They chat about data. They chat about requirements and interesting approaches. And they chat constantly about deployment. Which makes sense, because that’s the goal of their work—getting their code from their brain through testing and out to the world, in Web, app, or other form. Programmers, good ones, want to ship and move on to the next nail-biting problem. So there are lots of policies, tons of them, for deploying fresh code. For example:
- All programming work must happen in a branch.
- When work is done, we will merge it back into the main branch; and
- Run tests;
- Then “push” the code over to GitHub.
- At which point an automated service will run; and
- A service running on each of the 50 computers will “check out” the code; and
- Install it, overwriting the old version;
- Then stop the computer’s Web servers;
- Then restart them, so the new code can load and get to work.
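A policy like the one above is really an ordered checklist where a failure stops everything after it. Here is a minimal sketch of that idea, not a real continuous integration setup: the step names, host names, and the always-succeeding stand-in actions are all hypothetical, and a real system would run actual commands where this one returns True.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def succeed() -> bool:
    """Stand-in for a real action such as a merge, a push, or a restart."""
    return True


def build_steps(hosts: list[str], tests_pass: bool = True) -> list[Step]:
    """List the policy's steps in order: shared steps first, then per-server steps."""
    steps: list[Step] = [
        ("merge branch into main", succeed),
        ("run tests", lambda: tests_pass),
        ("push to GitHub", succeed),
    ]
    for host in hosts:
        steps += [
            (f"{host}: check out code", succeed),
            (f"{host}: install, overwriting old version", succeed),
            (f"{host}: stop web server", succeed),
            (f"{host}: restart web server", succeed),
        ]
    return steps


def run_pipeline(steps: list[Step]) -> list[str]:
    """Run steps in order, stop at the first failure, return the names that completed."""
    completed: list[str] = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    hosts = ["web-01", "web-02"]  # hypothetical; the article imagines 50 of these
    for name in run_pipeline(build_steps(hosts)):
        print("done:", name)
```

The point of stopping at the first failure is that broken tests never reach the push step, so nothing broken reaches the servers.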