In just a few short weeks we will be moving GitHub to a new home at Rackspace. We’re aware of the current stability and performance issues, and we want to let you know what we’re doing about them. After all, we’re GitHub users too! The move to Rackspace will bring about a new backend architecture and a lot more servers, leading to a much improved user experience for everyone. Thanks for sticking with us through our growing pains!
Since we have a highly technical audience, I wanted to share some background on the reasons behind the move, what we’ve been doing to prepare for the big changes ahead, and what kinds of service improvements you can look forward to seeing on the new infrastructure.
As you may know, we’ve had a hosting partnership with Engine Yard since very early in the GitHub story. Engine Yard is dedicated to supporting open source initiatives and saw value in helping us grow the site to foster innovation within the Ruby community. For their tireless support and expertise, we are extremely grateful. We wouldn’t be where we are now without them.
The decision to move hosts is never an easy one. The logistics of migrating a site as large and complex as GitHub are intimidating. The single most important reason we’re undertaking this effort is so that we can give you, our customer, a better experience on GitHub. We’re growing at a rate of over 400 new users and 1000 new repositories every day and these rates are only increasing with time. We need to take drastic action now to put in place the kind of infrastructure that will allow us to provide you with a top-notch user experience.
In making the decision to move hosts, we put together a set of requirements that would be necessary to ensure the viability of our business over the next ten years.
- Price. In order to keep ahead of the traffic curve, we need to have immediate access to affordable, commodity hardware. Within five years, it’s not hard to imagine that a cluster of 100 or more servers will be necessary to keep GitHub running smoothly. To guarantee a sustainable business, this amount of hardware must not be prohibitively expensive.
- Flexibility. We’ve grown to a size where it no longer makes sense to have every server virtualized. The benefits of running bare metal are obvious and have been empirically proven. We need to have the option to run bare metal when it is appropriate to the task at hand. We also need to be able to configure boxes with custom setups. If we need six large hard drives in a certain class of machine, then we must be able to get that. If we need boxes with 32GB or 64GB of RAM, those must be available.
- Capacity. It is undesirable to be the biggest fish in the pond. Our host must have experience with sites that are several orders of magnitude larger than us. We must feel comfortable knowing that all of the scalability requirements that we will encounter over the next ten years will be tractable on an available, battle-tested infrastructure.
- Control. Having direct access (via DRAC or similar) to the actual hardware means we can control every aspect of each server’s setup, from network layout and burn-in tests to operating system and RAID configuration. When system-level problems arise, we must be in a position to fix them without the need for outside intervention. At the end of the day, we should be responsible for as much of our stack as financially feasible.
- Globalization. Our long-term plan involves making GitHub faster for our international customers. Our host should have data centers in Europe and Asia so that we don’t have to look outside our primary provider to provision hardware around the world.
- Cloud. On-demand access to a cloud infrastructure will be important to us as we increase the number and variety of low-frequency but long-running jobs that we process. A provider that has a first-class cloud offering would be ideal for keeping latencies low and pricing simple.
- Trust. Our host should be a big-name player in the hosting field with an excellent reputation and multiple recommendations from other large sites. The entire future of our company rests on making GitHub stable, fast, and efficient. It is essential that our host be able to keep up with our exacting standards and provide us with competent service.
After evaluating our options, it became evident that Rackspace was the right choice for our ongoing hosting needs. They meet or exceed every one of our requirements and are the only large provider with a strong offering in both traditional and cloud services. In addition, we’ve arranged a partnership deal with Rackspace that includes discounted hardware that will allow us to bring more machines online faster than would otherwise be possible. It was important to us that this partnership not create any conflict of interest, so we’ll be true paying customers of Rackspace, just with mutually beneficial opportunities that will help keep our plans at their current low prices.
To give you a concrete idea of what this new partnership means to you, consider this: on Engine Yard we currently have the following resources (not including DB and CORAID which are out of our control):
- 10 VMs
- 39 VCPUs
- 54GB RAM
On Rackspace, we’ll be enjoying the following setup:
- 16 physical machines
- 128 physical cores
- 288GB RAM
I think the specifications speak for themselves! Within the new hardware layout, we’re placing significant importance on high availability and redundancy. On Rackspace, every piece of our infrastructure will have failover. That means two database servers, four web servers, two GitHub Pages instances, two Gem Server instances, two Archive Download instances, distributed Job runners, three pairs of file servers, and plenty more.
Speaking of file servers, the move to Rackspace means we’ll finally be leaving our shared file system behind. We’ve far exceeded the normal IO tolerances of GFS and it has become the source of many of the problems in our stack. It has also prevented us from adding additional hardware to the site for nearly a year. Tremendous kudos go to Chris Wanstrath for his ceaseless and amazing work over the last six months to optimize the site enough that it stays running (hurray for Memcache and Redis!).
Since April I’ve been working on a brand new federated backend architecture that will allow us to store repositories on commodity file servers. When we need more storage capacity, we merely have to add more machines and update a routing table. The file servers expose an RPC interface to the Git repositories that can be accessed from anywhere in the cluster. This will allow us to horizontally and separately scale the frontend, backend, and other pieces of the infrastructure.
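As a rough illustration of the routing-table idea (a sketch only, not GitHub's actual implementation), the lookup could be as simple as a map from repository owner to file server. Every name below (`RoutingTable`, `fs1`, the least-loaded placement rule) is hypothetical and chosen for the example, and the RPC layer is left out entirely:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RoutingTable:
    """Hypothetical map from repository owner to the file server holding their repos."""

    servers: List[str] = field(default_factory=list)
    routes: Dict[str, str] = field(default_factory=dict)

    def add_server(self, host: str) -> None:
        """Register a new file server; existing routes are left untouched."""
        if host not in self.servers:
            self.servers.append(host)

    def assign(self, owner: str) -> str:
        """Place a new owner on the server that currently holds the fewest owners."""
        if owner in self.routes:
            return self.routes[owner]
        if not self.servers:
            raise RuntimeError("no file servers registered")
        # Count how many owners each server already holds.
        load = {host: 0 for host in self.servers}
        for host in self.routes.values():
            load[host] += 1
        # Ties go to the server that was registered first.
        target = min(self.servers, key=lambda host: load[host])
        self.routes[owner] = target
        return target

    def lookup(self, owner: str) -> str:
        """Return the server for an owner; raises KeyError if the owner is unknown."""
        return self.routes[owner]


table = RoutingTable()
table.add_server("fs1")
table.add_server("fs2")
table.assign("alice")
table.assign("bob")

# Adding capacity is just "add a machine, update the table":
table.add_server("fs3")
table.assign("carol")
```

In this sketch, adding `fs3` moves nothing: existing owners keep their routes and only new owners land on the new machine. A real system would also need to persist the table and handle replication and failover, none of which is modeled here.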
There are too many improvements in too many parts of our process and infrastructure to cover in this already lengthy post. I’ll happily dive into the specifics of the new architecture and other logistics in a series of follow-up articles over the coming weeks.
Right now we’re putting the finishing touches on the production Rackspace cluster and working on the big repository data migration. I’ll be keeping you updated on the progress and the countdown to the final move. We’re aiming to restrict downtime to the day of the actual move and limit service interruption as much as possible.
It takes a lot of effort to build, maintain, and host a site like GitHub. I’d like to thank Engine Yard for getting us to where we are, Rackspace for helping us get to where we’re going, and you for making GitHub such an amazing project to work on.


Sounds great!
When are you planning to do the migration? I assume there'll be a lot of forewarning? :)
Good luck with the move! Nice evaluation. Here's hoping you guys publish technical writeups after the transition.
Congrats on this 'good problem to have'! Keep up the great work!
Very good to hear this is finally happening - the service outage / degradation tweets have been coming on just a bit too much! Look forward to silky smooth github in the near future.
Good luck with your migration, guys. I had to migrate our site from one host to another about 18 months ago, and that was definitely a "fun" challenge. And congrats on your success!
Peer
Show your racks!
Great post - I love that you guys are so transparent.
Cool post. Have fun w/ the move.
great news. count on my patience for the migration. congrats.
This is great to hear - it's been pretty slow recently and I'd been considering jumping ship. That would have been painful indeed. Now that I know what you're planning I'll stick with you guys for a while longer :)
Github is something I recommend to people every day. I love it, and will love it all the more when it's back up to full speed.
Looking forward to the upgrade.
Liked the post, looking forward to the changes.
Same here. I was thinking of hosting my own Git server so I get the benefit of being the only man on the server (plus all the headaches).
Glad to see things will get faster around here.
Godspeed!
Good luck on the move Team Github!
Posts like this and the deployment ones Chris has done in the past have really brought light to issues a lot of us who are just getting into the SaaS business could potentially run into and how to deal with them. I'm not only grateful for this service, but also the level of transparency that you've shared with us. :)
Awesome, hope to see more updates as the migration proceeds.
You still going with the DRBD + Heartbeat setup?
Awesome! Congrats! You are staying true to your company name :)
Good luck and big thanks!
Hat tip to you guys for being proactive about this. Current performance of the site is good and you've ensured that things will stay that way.
github having a 10 year vision, I am impressed
sourceforge is a decade old and is now reinventing itself
keep up the good work and hope the uptime increases on the new host, and take our goodwill to the next level
Cheers!
Thank you for doing this, I've got to the point where I don't check my github messages because the page loads take way too long. I get quite a few messages a day, so having to wait 10 seconds for the page load took way too long. I'm glad you are addressing the issue.
Great. Looking forward to performance improvements. 2 years ago we were moving Wikidot.com from a hosting in Germany to SoftLayer.com (US), managed to do everything live, with only a few seconds of downtime for switchover, but it was at least kind of complex. At that time we had about 100k users already and quite large traffic.
Good luck!
We will give you guys warning once we know a date. Especially since users with a TLD that point to Pages will need time to change their DNS records (since a TLD can't use a CNAME).
I hope you will comment (now or in a future post) on what, if anything, you are using to manage your new dedicated servers.
Thanks for the update and we all appreciate the service.
Cheers.
Awesome post guys, we're really looking forward to soaking up all of those fancy disks and cpus with rails commits.
Good to see the details on this finally out in the open. I'm eagerly awaiting the results of this move... rock on.
Godspeed, I say. Whatever the business details were, the fact is that the current hosting arrangement couldn't handle the load now, much less 12-18 months out. I look forward to worrying about GitHub's uptime less and less... :-)
Excellent post, thanks for the transparency and trust.
Great to get some technical insight in your hosting situation. Hope to see more in depth stuff later on as well :)
Great! Though I've been one of the many who has complained about the speeds recently, I really appreciate the work you guys have put into it (especially defunkt), and all the nice sharing of stories you do. The more a company tells me about what's going on, the more I trust them to take care of me.
Good news everyone!
https://www.youtube.com/watch?v=1D1cap6yETA
Great stuff, I'd like even more details.
Excellent write-up. Good luck with the move!
Awesome, now I can use GitHub without having to feel like I should do something else in parallel to make use of the time :P
oh yea! We've had managed hosting at rackspace for 4 years now. A solid environment and great service. Highly recommended. Good luck with your migration!
Really looking forward to this. Best of luck with the move.
Congratulations! What a sweet "problem" to have :). I am grateful for the openness of GitHub and I am looking forward to a post on "upgrade experience" so that we can learn too! Best of luck and I will be patient during the upgrade process :).
Any possibility of getting dollars and cents numbers in a post? That would be really helpful for the community.
github is awesome, and rackspace is awesome (so is 21st Amendment, where i hear github was built and where i've met many rackspace people). this makes a lot of sense on both sides, since rackspace is an excellent host, and github is an excellent breeding ground for open source software (which rackspace is a part of at https://github.com/rackspace)
Good luck on the move. Moving servers is a pretty big undertaking, so I hope it goes without too many problems.
I look forward to a more snappy website and possibly faster pushing and pulling!
Rackspace is hardly cheap or flexible for the average client. I hope you got an amazing deal and not their usual "here we will mark this Dell down to $600 a month" sort of nonsense.
Look at their public financials and stated goals sometime. Even getting the equipment for free, because of GitHub's image in certain circles you are doing Rackspace a HUGE favor.
Try to not schedule any important issues during their shift changes, they hate that - plus you have to bring the replacement up to speed on the problem.
thx & fingers crossed
Peter
Good post - thanks. My employer is just moving our hosted customer systems to Rackspace for all of the reasons you cite, so it's good to know that we're not alone!
Awesome!
Since Rackspace doesn't support Ruby, could you explain what exactly Rackspace will be helping you with in terms of scaling and performance?
@dmillar: faster hardware, and rapidly added additional or improved hardware as required.
I'd be interested in some technical details about the backend and the RPC APIs you expose to Git repos. Is there anything like a NameNode in that architecture? It sounds a bit like NFS in your article ...
commodity file servers + RPC is almost surely NFS
Technical details of the new architecture: https://www.anchor.com.au/blog/2009/09/github-designing-success/
when will ssh be available once more?