Last Sunday we migrated all of our customers, data and systems to a new primary data center in order to set the foundations for further growth. The migration involved synchronizing approximately 10 terabytes (TB) over the 1,500 km between our old primary data center in Dallas and a new data center in Chicago that we had prepared in advance. 10 TB is approximately the equivalent of the entire printed collection of the US Library of Congress.
On the day, we had 34 staff in multiple offices. They performed data migration, database and infrastructure tasks, tested the application to ensure the migration had completed successfully, took care of social media and customer tickets, and stood by in case we had any code issues. The team atmosphere was great, with people pitching in where required, from making coffees to triaging issues.
We completed the whole process about an hour early and have had a very small number of issues since we went live. Those of you who have been involved in large IT projects will be aware that migrations are not often this smooth. The key for Xero was a lot of preparation and practice combined with a very talented and committed team.
Since the migration, the Xero applications have been slightly more than 10% faster. While our goals weren’t performance related, it is good to see a material improvement.
The project to plan for, design, build and migrate to our new hosting environment started in late June 2012 with a planning trip to San Francisco to meet with our CTO and then on to San Antonio, Texas to meet with Rackspace – our primary hosting partner.
We looked at the various options open to us, from Platform as a Service (PaaS) offerings such as Microsoft Azure, Amazon Web Services (AWS) and Rackspace OpenCloud through to building and managing our own equipment. Ultimately we settled on the Rackspace Private Cloud solution because it allowed us to meet our timeframes, our performance and capacity requirements, the need for rapid scaling and, of course, cost effectiveness.
We have been operating on the Rackspace Managed Hosting and Private Cloud solutions since March 2008. Rackspace offer a flexible and comprehensive solution that gives us a good balance between the benefits of our own equipment and Infrastructure as a Service.
A number of people have asked us for the technical details of our new hosting environment, so the remainder of this post covers the aspects of the environment that we are able to disclose.
The new hosting environment has completely new equipment, new versions of our virtualization software, operating systems, database software and SAN hardware. Every aspect of the environment has redundancy, which means that the failure of one (or often multiple) components won’t affect our customers.
We run a 10 Gb/s network backbone, with smaller servers and devices connected by one or two 1 Gb/s connections. We have F5 load balancers, along with intrusion detection systems and firewalls whose details we can't disclose.
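To put those link speeds in perspective, here is a rough back-of-the-envelope sketch of how long it would take to move one terabyte at each speed. It assumes decimal units (1 TB = 10^12 bytes) and an ideal line rate with no protocol overhead, so real transfers would be slower. The function name is purely illustrative and is not something from our environment.

```python
def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


TB = 10**12    # decimal terabyte, in bytes (assumption)
GBPS = 10**9   # one gigabit per second, in bits per second

# Link speeds taken from the paragraph above: one or two 1 Gb/s
# connections for smaller devices, and the 10 Gb/s backbone.
links = [("1 Gb/s", 1 * GBPS), ("2 x 1 Gb/s", 2 * GBPS), ("10 Gb/s", 10 * GBPS)]

for label, rate in links:
    minutes = transfer_time_seconds(1 * TB, rate) / 60
    print(f"{label}: about {minutes:.1f} minutes per TB")
```

At the ideal rate that works out to roughly 133 minutes per terabyte on a single 1 Gb/s connection and roughly 13 minutes on the 10 Gb/s backbone.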
We moved our storage to an EMC VNX SAN, which has a mixture of SSD flash and traditional spinning disks. Overall we have over 100 TB of usable storage due to the need for multiple redundant copies of key data.
The servers in our new hosting environment are almost entirely virtualized on dedicated physical servers running VMware vSphere 5.1. We have two clusters – one runs application and utility servers and the other runs some of our database servers. Most of our servers run Windows Server 2008 R2, although we have a small number running Linux.
Our database layer was upgraded from Microsoft SQL Server 2008 to SQL Server 2012. SQL Server 2012 has a number of new features, but the most important for us was AlwaysOn Availability Groups. Our old platform used SQL Server failover cluster instances on Windows clustering for database resilience, and we wanted something better. By comparison, Availability Groups are much easier to implement and maintain, and they recover rapidly in the event of a failure.