Now that things have settled down from the move to Rackspace, I wanted to take some time to go over the architectural changes that we’ve made in order to bring you a speedier, more scalable GitHub.
In my first draft of this article I spent a lot of time explaining why we made each of the technology choices that we did. After a while, however, it became difficult to separate the architecture from the discourse and the whole thing became confusing. So I’ve decided to simply explain the architecture and then write a series of follow-up posts with more detailed analyses of exactly why we made the choices we did.
There are many ways to scale modern web applications. What I will be describing here is the method that we chose. This should by no means be considered the only way to scale an application. Consider it a case study of what worked for us given our unique requirements.
Understanding the Protocols
We expose three primary protocols to end users of GitHub: HTTP, SSH, and Git. When browsing the site with your favorite browser, you’re using HTTP. When you clone, pull, or push to a private URL like git@github.com:mojombo/jekyll.git you’re doing so via SSH. When you clone or pull from a public repository via a URL like git://github.com/mojombo/jekyll.git you’re using the Git protocol.
The easiest way to understand the architecture is by tracing how each of these requests propagates through the system.
Tracing an HTTP Request
For this example I’ll show you how a request for a tree page such as https://github.com/mojombo/jekyll happens.
The first thing your request hits after coming down from the internet is the active load balancer. For this task we use a pair of Xen instances running ldirectord. These are called lb1a and lb1b. At any given time one of these is active and the other is waiting to take over in case the active node fails. The load balancer doesn’t do anything fancy. It forwards TCP packets to various servers based on the requested IP and port and can remove misbehaving servers from the balance pool if necessary. In the event that no servers are available for a given pool it can serve a simple static site instead of refusing connections.
For requests to the main website, the load balancer ships your request off to one of the four frontend machines. Each of these is an 8 core, 16GB RAM bare metal server. Their names are fe1, …, fe4. Nginx accepts the connection and sends it to a Unix domain socket upon which sixteen Unicorn worker processes are selecting. One of these workers grabs the request and runs the Rails code necessary to fulfill it.
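If you’re curious what that looks like in practice, here’s a minimal Unicorn config sketch. The socket path and values are illustrative, not our exact production settings:

```ruby
# config/unicorn.rb -- illustrative values only
worker_processes 16                      # sixteen workers per frontend machine
listen "/data/github/tmp/unicorn.sock",  # Nginx proxies to this Unix domain socket
       :backlog => 1024
preload_app true                         # load Rails once, then fork workers
timeout 30                               # kill any worker stuck longer than 30s
```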
Many pages require database lookups. Our MySQL database runs on two 8 core, 32GB RAM bare metal servers with 15k RPM SAS drives. Their names are db1a and db1b. At any given time, one of them is master and one is slave. Replication between the two is handled at the block level by DRBD.
If the page requires information about a Git repository and that data is not cached, then it will use our Grit library to retrieve the data. In order to accommodate our Rackspace setup, we’ve modified Grit to do something special. We start by abstracting out every call that needs access to the filesystem into the Grit::Git object. We then replace Grit::Git with a stub that makes RPC calls to our Smoke service. Smoke has direct disk access to the repositories and essentially presents Grit::Git as a service. It’s called Smoke because Smoke is just Grit in the cloud. Get it?
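Here’s a rough sketch of the stubbing idea. The module name, host, and port are assumptions, and the real Grit internals differ; this just shows the shape of it:

```ruby
# A sketch of the stubbing idea only -- not the actual Grit/Smoke source.
require 'bertrpc'

module Grit
  class Git
    def initialize(git_dir)
      @git_dir = git_dir
      @svc = BERTRPC::Service.new('smoke', 9999)  # load-balanced hostname
    end

    # Any call that would have touched the local filesystem, like
    # git.rev_list(...), becomes a BERT-RPC call to the Smoke service,
    # which runs the real Grit::Git against the repository on disk.
    def method_missing(name, *args)
      @svc.call.smoke.send(name, @git_dir, *args)
    end
  end
end
```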
The stubbed Grit makes RPC calls to smoke, a load-balanced hostname that maps back to the fe machines. Each frontend runs four ProxyMachine instances behind HAProxy that act as routing proxies for Smoke calls. ProxyMachine is my content-aware (layer 7) TCP routing proxy that lets us write the routing logic in Ruby. The proxy examines the request and extracts the username of the repository that has been specified. We then use a proprietary library called Chimney (it routes the smoke!) to look up the route for that user. A user’s route is simply the hostname of the file server on which that user’s repositories are kept.
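A stripped-down ProxyMachine config for this might look something like the following. The request parser, the error helper, and the port number are placeholders rather than our production routing logic:

```ruby
# Illustrative ProxyMachine config for the Smoke proxy.
proxy do |data|
  user = extract_username(data)      # hypothetical parser for the buffered request
  if user.nil?
    { :noop => true }                # not enough data yet; keep buffering
  elsif route = Chimney.get_route(user)
    { :remote => "#{route}:8149" }   # transparent proxy to that file server
  else
    { :close => error_reply(user) }  # reply and hang up if no route exists
  end
end
```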
Chimney finds the route by making a call to Redis. Redis runs on the database servers. We use Redis as a persistent key/value store for the routing information and a variety of other data.
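A minimal sketch of that lookup, assuming a hypothetical key scheme (the real Chimney does more than this):

```ruby
require 'redis'

class Chimney
  def self.redis
    @redis ||= Redis.new(:host => 'db1a')
  end

  # Returns the file server hostname for a user, e.g. "fs3a".
  def self.get_route(user)
    redis.get("chimney:route:#{user}")
  end
end
```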
Once the Smoke proxy has determined the user’s route, it establishes a transparent proxy to the proper file server. We have four pairs of fileservers. Their names are fs1a, fs1b, …, fs4a, fs4b. These are 8 core, 16GB RAM bare metal servers, each with six 300GB 15K RPM SAS drives arranged in RAID 10. At any given time one server in each pair is active and the other is waiting to take over should there be a fatal failure in the master. All repository data is constantly replicated from the master to the slave via DRBD.
Every file server runs two Ernie RPC servers behind HAProxy. Each Ernie spawns 15 Ruby workers. These workers take the RPC call, reconstitute the original Grit call, and perform it. The response is sent back through the Smoke proxy to the Rails app where the Grit stub returns the expected Grit response.
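An Ernie handler in the spirit of Smoke might look like this. The module and function names are assumptions, not the actual Smoke source:

```ruby
require 'ernie'
require 'grit'

module Smoke
  # Run the real Grit call against the repository on local disk;
  # Ernie serializes the return value back to the caller as BERT.
  def rev_list(git_dir, *args)
    Grit::Git.new(git_dir).rev_list({}, *args)
  end
end

Ernie.expose(:smoke, Smoke)
```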
When Unicorn is finished with the Rails action, the response is sent back through Nginx and directly to the client (outgoing responses do not go back through the load balancer).
Finally, you see a pretty web page!
The above flow is what happens when there are no cache hits. In many cases the Rails code uses Evan Weaver’s Ruby memcached client to query the memcached servers that run on each slave file server. Since these machines are otherwise idle, we give each of them 12GB of memcached. These servers are aliased as memcache1, …, memcache4.
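A read-through cache with that client looks roughly like this. The key naming, TTL, and lookup helper are illustrative:

```ruby
require 'memcached'

CACHE = Memcached.new(%w[memcache1:11211 memcache2:11211
                         memcache3:11211 memcache4:11211])

def cached_commit(repo, sha)
  CACHE.get("commit:#{repo}:#{sha}")
rescue Memcached::NotFound
  commit = expensive_grit_lookup(repo, sha)  # hypothetical helper
  CACHE.set("commit:#{repo}:#{sha}", commit, 3600)
  commit
end
```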
BERT and BERT-RPC
For our data serialization and RPC protocol we are using BERT and BERT-RPC. You haven’t heard of them before because they’re brand new. I invented them because I was not satisfied with any of the available options that I evaluated, and I wanted to experiment with an idea that I’ve had for a while. Before you freak out about NIH syndrome (or to help you refine your freak out), please read my accompanying article Introducing BERT and BERT-RPC about how these technologies came to be and what I intend for them to solve.
If you’d rather just check out the spec, head over to https://bert-rpc.org.
For the code hungry, check out my Ruby BERT serialization library BERT, my Ruby BERT-RPC client BERTRPC, and my Erlang/Ruby hybrid BERT-RPC server Ernie. These are the exact libraries we use at GitHub to serve up all repository data.
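To give you a feel for the client side, a BERT-RPC call through BERTRPC looks like this (this is the example from the BERTRPC README):

```ruby
require 'bertrpc'

# Connect to an Ernie server and make a synchronous call.
svc = BERTRPC::Service.new('localhost', 9999)
svc.call.calc.add(1, 2)
# => 3
```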
Tracing an SSH Request
Git uses SSH for encrypted communications between you and the server. In order to understand how our architecture deals with SSH connections, it is first important to understand how this works in a simpler setup.
Git relies on the fact that SSH allows you to execute commands on a remote server. For instance, the command ssh tom@frost ls -al runs ls -al in the home directory of my user on the frost server. I get the output of the command on my local terminal. SSH is essentially hooking up the STDIN, STDOUT, and STDERR of the remote machine to my local terminal.
If you run a command like git clone tom@frost:mojombo/bert, what Git is doing behind the scenes is SSHing to frost, authenticating as the tom user, and then remotely executing git upload-pack mojombo/bert. Now your client can talk to that process on the remote server by simply reading and writing over the SSH connection. Neat, huh?
Of course, allowing arbitrary execution of commands is unsafe, so SSH includes the ability to restrict what commands can be executed. In a very simple case, you can restrict execution to git-shell which is included with Git. All this script does is check the command that you’re trying to execute and ensure that it’s one of git upload-pack, git receive-pack, or git upload-archive. If it is indeed one of those, it uses exec to replace the current process with that new process. After that, it’s as if you had just executed that command directly.
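To make that concrete, here’s a conceptual Ruby rendering of what git-shell enforces. The real thing is a small C program shipped with Git that sshd invokes as the user’s login shell, roughly as git-shell -c "&lt;command&gt;":

```ruby
ALLOWED = %w[git-upload-pack git-receive-pack git-upload-archive]

original = ARGV[1].to_s            # the "<command>" string after -c
verb, path = original.split(' ', 2)

if ALLOWED.include?(verb) && path
  exec(verb, path.delete("'"))     # replace this process with the Git command
else
  abort "fatal: unrecognized command: #{original}"
end
```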
So, now that you know how Git’s SSH operations work in a simple case, let me show you how we handle this in GitHub’s architecture.
First, your Git client initiates an SSH session. The connection comes down off the internet and hits our load balancer.
From there, the connection is sent to one of the frontends where sshd accepts it. We have patched our SSH daemon to perform public key lookups from our MySQL database. Your key identifies your GitHub user and this information is sent along with the original command and arguments to our proprietary script called Gerve (Git sERVE). Think of Gerve as a super smart version of git-shell.
Gerve verifies that your user has access to the repository specified in the arguments. If you are the owner of the repository, no database lookups need to be performed, otherwise several SQL queries are made to determine permissions.
Once access has been verified, Gerve uses Chimney to look up the route for the owner of the repository. The goal now is to execute your original command on the proper file server and hook your local machine up to that process. What better way to do this than with another remote SSH execution!
I know it sounds crazy but it works great. Gerve simply uses exec(3) to replace itself with a call to ssh git@&lt;route&gt; &lt;command&gt; &lt;arg&gt;. After this call, your client is hooked up to a process on a frontend machine which is, in turn, hooked up to a process on a file server.
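Here’s a hypothetical sketch of just that final step. The argument handling and names are guesses, and the permission checks from above are omitted:

```ruby
github_user = ARGV[0]                 # user identified by the public key lookup
verb, path  = ARGV[1].split(' ', 2)   # e.g. "git-upload-pack", "'mojombo/bert.git'"

owner = path.delete("'").split('/').first
route = Chimney.get_route(owner)      # file server hostname, e.g. "fs2a"

# exec(3) replaces Gerve with an internal SSH session; from here on the
# frontend is just a byte pipe between your client and the file server.
exec('ssh', "git@#{route}", verb, path)
```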
Think of it this way: after determining permissions and the location of the repository, the frontend becomes a transparent proxy for the rest of the session. The only drawback to this approach is that the internal SSH is unnecessarily encumbered by the overhead of encryption/decryption when none is strictly required. It’s possible we may replace this internal SSH call with something more efficient, but this approach is just too damn simple (and still very fast) to make me worry about it very much.
Tracing a Git Request
Performing public clones and pulls via Git is similar to how the SSH method works. Instead of using SSH for authentication and encryption, however, it relies on a server-side Git Daemon. This daemon accepts connections, verifies the command to be run, and then uses fork(2) and exec(3) to spawn a worker that then becomes the command process.
With this in mind, I’ll show you how a public clone operation works.
First, your Git client issues a request containing the command and repository name you wish to clone. This request enters our system on the load balancer.
From there, the request is sent to one of the frontends. Each frontend runs four ProxyMachine instances behind HAProxy that act as routing proxies for the Git protocol. The proxy inspects the request and extracts the username (or gist name) of the repo. It then uses Chimney to look up the route. If there is no route or any other error is encountered, the proxy speaks the Git protocol and sends back an appropriate message to the client. Once the route is known, the repo name (e.g. mojombo/bert) is translated into its path on disk (e.g. a/a8/e2/95/mojombo/bert.git). On our old setup that had no proxies, we had to use a modified daemon that could convert the user/repo into the correct filepath. By doing this step in the proxy, we can now use an unmodified daemon, allowing for a much easier upgrade path.
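For illustration, the path translation could be implemented along these lines. The hashing scheme here is a guess; nothing above specifies how the on-disk prefix (a/a8/e2/95) is actually derived:

```ruby
require 'digest/md5'

def repo_path(name)                  # e.g. "mojombo/bert"
  hex = Digest::MD5.hexdigest(name)  # assumption: shard by a hash of the name
  "#{hex[0, 1]}/#{hex[0, 2]}/#{hex[2, 2]}/#{hex[4, 2]}/#{name}.git"
end
```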
Next, the Git proxy establishes a transparent proxy with the proper file server and sends the modified request (with the converted repository path). Each file server runs two Git Daemon processes behind HAProxy. The daemon speaks the pack file protocol and streams data back through the Git proxy and directly to your Git client.
Once your client has all the data, you’ve cloned the repository and can get to work!
Sub- and Side-Systems
In addition to the primary web application and Git hosting systems, we also run a variety of other sub-systems and side-systems. Sub-systems include the job queue, archive downloads, billing, mirroring, and the svn importer. Side-systems include GitHub Pages, Gist, gem server, and a bunch of internal tools. You can look forward to explanations of how some of these work within the new architecture, and what new technologies we’ve created to help our application run more smoothly.
Conclusion
The architecture outlined here has allowed us to properly scale the site and resulted in massive performance increases across the entire site. Our average Rails response time on our previous setup was anywhere from 500ms to several seconds depending on how loaded the slices were. Moving to bare metal and federated storage on Rackspace has brought our average Rails response time to consistently under 100ms. In addition, the job queue now has no problem keeping up with the 280,000 background jobs we process every day. We still have plenty of headroom to grow with the current set of hardware, and when the time comes to add more machines, we can add new servers on any tier with ease. I’m very pleased with how well everything is working, and if you’re like me, you’re enjoying the new and improved GitHub every day!


Thanks for sharing information about your architecture!
Still, would be cool to have gem building back :-|
sweeet :D lots of stuff to go wrong ;) haha
i dont miss gem building now, it was just frustrating to move / rename everything blah blah
What an awesome write up. Very neat to see how you guys get things done.
Awesome write up. I love the openness :) And damn it, scaling web apps is hard, let's go shopping :P
Great write up. I’m really impressed with all the backend stuff you guys have going on.
Thanks for sharing. Waiting for more war stories from github data center ;).
My brain hurts
Yeah, but can it make me dinner?
Great write-up, thanks! I look forward to hearing how you monitor all these tiers and keep it running smoothly.
Awesome. Nerd porn!
@sbryant : you could probably cook your dinner on 'lb1a' or 'lb1b' if you were really hungry...
Thanks for sharing, it's a nice example of using lots of little bits all hooked up to do something complex
Have you tried getting the ssh patches upstream? Would be cool to get that included...
I really had to laugh, once I got to the first example code.
ernie... now that's just hilarious :D
Can you explain some of the thinking that lead you to choose ldirectord as the load balancer (lb1a and lb1b)? Were there other options you investigated?
Thanks so much for sharing. The speed improvement is quite inspiring to those of us creating sites we'd like to speed up someday.
Loved reading through this, and look forward to what else you guys have to share.
It definitely makes me excited to see raw materials (the tools, like Redis and such) evolving into solutions that can start to see widespread use. Evolution of solutions, the industry, and technologies.
Cheers!
What happens with the memcache server on the slave file server when it takes over?
I really love GitHub but it's only fast sometimes.
I always enjoy architecture articles. Considering the nature of the site, these articles are a perfect way to connect with the community. That's one thing that impresses me about github, the community mindedness is ever present.
It sounds very funny.
Thanks for sharing this, very interesting.
Great article!
May I ask the reasons why you chose ldirectord on the frontend over haproxy? I'm building a similar infrastructure, and would love to know more about ldirectord.
Kristóf
By the way, do you follow the discussion about git-daemon caching (the 'ne/rev-cache' branch in 'pu') from the Google Summer of Code 2009 project? It is not yet ready, but this mechanism is meant to speed up the enumerating-objects part of git-clone and git-fetch; because of Git's architecture, it would work for both the git:// and SSH protocols.
Very nice writeup, although I'd like to see some pretty pictures of connections between servers, and example route(s) that different requests can take. Good work!
wow. Too much to digest. Thanks for sharing this info, I am sure it will help all of us trying to scale one way or other
It looks like Rails is hurting your performance a lot, as happens with every website using Ruby. Why don't you try Wt? https://webtoolkit.eu - It's C++ and very efficient. I'd say it'll cut your hardware requirements in half.
This writeup is awesome. Thanks a lot, Tom.
Doesn't sound so hard... ;)
@pgquiles If they decide to switch to Wt (yes, I've used it), they need to COMPLETELY rewrite the frontend. Wt is more like ordinary toolkits (think Qt), so all of the view files would need to be rewritten with nested classes, etc. The speed of the application may go up, but developing would take SO much more time. And besides, as far as I know, it's not Rails that was the biggest problem, but the fact that Git(Hub) is very heavy on the IO, and virtual machines don't handle that all too well.
You're a gifted technical writer. More of this, please. Thanks for sharing in such great detail what many other would consider business secrets.
GitHub was way too slow a few weeks ago (before the move), I was hoping you would do something about that at the time. Much better now! Browsing the site is a lot of fun again.