In May, we let you know about the improvements we’re making to our infrastructure platform to ensure it’s ready for the next phase of our growth. We wanted to share why we’re making these changes and how the new platform will work, starting with a bit of history about the way Xero’s platform has evolved. By ‘platform,’ we mean all of the servers, network, storage and so on that we use to run Xero’s applications. The changes to Xero’s platform structure discussed in this post are happening at the same time as we migrate to Amazon Web Services.
The early days
In the early days of Xero, we used a single database to store all our customer information, from financial data to billing, bank feeds and user records. This database was easy to work with and we made sure it was resilient, but as we grew it struggled to handle the load generated by lots of new customers.
In recent years
To make sure our platform could handle our growth, we changed the way we stored customer data to use ‘sharding’. Sharding is a common method that involves dividing customers up into groups and allocating each group to its own dedicated database cluster. At Xero, we allocated 40,000 subscriptions to each database cluster. New customers were added to the newest cluster as they signed up, and when that cluster held 40,000 subscriptions, we provisioned a new cluster and repeated the process.
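To make the fill-then-provision process concrete, here is a minimal sketch in Python. It is an illustration only, not Xero's actual code: the `ShardAllocator` class, its in-memory directory and the subscription ids are all hypothetical, and the only figure taken from this post is the 40,000-subscription limit per cluster.

```python
from __future__ import annotations

from dataclasses import dataclass, field

CLUSTER_CAPACITY = 40_000  # subscriptions per database cluster, as described above


@dataclass
class ShardAllocator:
    """Hypothetical fill-then-provision allocator, for illustration only."""

    capacity: int = CLUSTER_CAPACITY
    # Number of subscriptions held by each cluster; starts with one empty cluster.
    clusters: list[int] = field(default_factory=lambda: [0])
    # Lookup from subscription id to the index of the cluster that holds it.
    directory: dict[str, int] = field(default_factory=dict)

    def allocate(self, subscription_id: str) -> int:
        """Place a new subscription on the newest cluster, adding one if it is full."""
        if self.clusters[-1] >= self.capacity:
            self.clusters.append(0)  # stands in for provisioning a new cluster
        index = len(self.clusters) - 1
        self.clusters[index] += 1
        self.directory[subscription_id] = index
        return index

    def cluster_for(self, subscription_id: str) -> int:
        """Find the cluster that stores an existing subscription."""
        return self.directory[subscription_id]
```

A subscription stays on the cluster it was first allocated to, so existing customers are never moved when capacity is added. The sketch keeps the directory in memory purely for simplicity.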
Sharding allowed us to smoothly expand the platform as we grew from 10,000 to 700,000 subscriptions. But all scaling strategies have their limits, and a change in approach is needed to deal with the next phase of growth. This is the point we’ve reached at Xero – the database scaling approach we have used for the last three years is no longer sufficient for the future.
As we move to AWS, we are changing the structure of the underlying platform to ensure it can grow to handle millions of organisations. The new platform is structured around cells. Each cell contains everything needed to run the Xero application for 100,000 subscriptions: all of the networks, storage, servers, databases and application software.
In many ways, cells are the logical extension of the sharding method that has worked so well at Xero. Previously, using shards, we gave groups of customers their own database clusters, but used shared infrastructure for everything else. With cells, groups of 100,000 subscriptions run on their own dedicated infrastructure, including networks, storage, servers, databases and application software.
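One way to picture a cell is as a bundle of dedicated resources that is stamped out once per 100,000 subscriptions. The sketch below is a rough illustration under that assumption. The `Cell` type, the naming scheme and the `cells_needed` helper are hypothetical and do not describe how Xero actually provisions its AWS infrastructure.

```python
import math
from dataclasses import dataclass

CELL_CAPACITY = 100_000  # subscriptions per cell, as described above


@dataclass(frozen=True)
class Cell:
    """A self-contained slice of the platform (illustrative names only)."""

    name: str
    network: str
    storage: str
    servers: str
    database: str
    application: str


def build_cell(index: int) -> Cell:
    """Create a cell whose resources are all its own; nothing is shared between cells."""
    name = f"cell-{index:03d}"
    return Cell(
        name=name,
        network=f"{name}-network",
        storage=f"{name}-storage",
        servers=f"{name}-servers",
        database=f"{name}-database",
        application=f"{name}-application",
    )


def cells_needed(subscriptions: int, capacity: int = CELL_CAPACITY) -> int:
    """Number of cells required to hold the given number of subscriptions."""
    return math.ceil(subscriptions / capacity)
```

Under these assumptions, `cells_needed(700_000)` gives 7 cells, and growing the platform means building another cell rather than stretching shared infrastructure.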
The central reason we’re adopting the cell model is to allow us to continue to grow the platform without compromising service or performance as we add customers. Cells provide some other important benefits as well:
- They enable smarter ways of deploying software. Using cells, we can deploy new features to a subset of organisations, or run two different versions of the same feature at once to see which works best.
- They limit the impact of a failure. Although it is impossible to completely avoid infrastructure failures, with cells, we can control how much impact those failures have and how quickly we respond to them.
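Both benefits follow from treating the cell as the unit of deployment and the unit of failure. The sketch below shows the idea in Python. It is a simplified illustration, and the function names, cell names and version strings are all made up for the example rather than taken from Xero's tooling.

```python
def plan_rollout(cells: list[str], first_wave: int = 1) -> tuple[list[str], list[str]]:
    """Split cells into a first wave that gets the new version and the rest."""
    return cells[:first_wave], cells[first_wave:]


def version_for(cell: str, first_wave: list[str], new: str, current: str) -> str:
    """Pick the version a cell should run while a rollout is in progress."""
    return new if cell in first_wave else current


def affected_fraction(failed_cells: int, total_cells: int) -> float:
    """Share of the platform affected when some cells fail, assuming equal-sized cells."""
    return failed_cells / total_cells


# Hypothetical platform with seven cells.
cells = [f"cell-{i:03d}" for i in range(1, 8)]
first_wave, remaining = plan_rollout(cells, first_wave=1)

# Only organisations in the first-wave cell see the new version.
versions = {c: version_for(c, first_wave, new="v2", current="v1") for c in cells}
```

In this example only the organisations in `cell-001` run `v2`, so the two versions can be compared side by side, and a failure confined to one of the seven cells would touch roughly a seventh of subscriptions instead of all of them.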
As a customer, you probably won’t notice the change in the underlying structure of Xero’s platform, but over time it will allow us to grow and support millions of subscribers.
Stay tuned for more information about the technical architecture of Xero’s new platform and why we’re making these changes.