Database platform upgrade

As you may have seen through the Xero in-app notifications, we’ve scheduled an outage for 5am NZT on Monday 4 July 2011 (please check the equivalent time in your own time zone) to undertake some hosting infrastructure maintenance.

We wanted to share a little more about this particular outage and some further infrastructure upgrades we’re planning in the coming weeks as we continue to expand our platform to accommodate our rapid customer growth.

During June, Xero had two short unscheduled outages. Our investigations showed both incidents were caused by a form of ‘race condition’ or deadlock in our Microsoft SQL Server database layer, which caused requests to the database to lock up. We escalated the issue to Microsoft Premier Support, through our hosting partner Rackspace, which identified a software patch for this particular deadlock issue. We’ve successfully tested this fix in our dev and staging environments and we’re now ready to roll it into our production database environment.

As we use redundant database clusters in the production environment, it may have been possible to make these changes without an outage. However, due to the nature of the fix, we’ve chosen to take the more cautious approach and schedule an outage at a low usage time to ensure the fix is applied without any problems or risk to customer data.

We’re pleased to have identified the cause of these two recent issues. It’s disappointing to have any unscheduled outages, but to put these in context we’ve maintained a 99.99% service availability since we launched Xero more than four years ago.
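To give a rough sense of what that availability figure implies, here is a minimal sketch of the arithmetic. It assumes exactly four years of service and a 365.25-day year, and the helper name `allowed_downtime_hours` is purely illustrative; the result is an indicative downtime budget, not a record of actual outage time.

```python
# Illustrative arithmetic only: the period length and year length are assumptions.
HOURS_PER_YEAR = 365.25 * 24


def allowed_downtime_hours(availability_pct: float, years: float) -> float:
    """Return the total downtime (in hours) consistent with the given availability."""
    total_hours = years * HOURS_PER_YEAR
    return total_hours * (1 - availability_pct / 100)


budget = allowed_downtime_hours(99.99, 4)
print(f"{budget:.1f} hours (~{budget * 60:.0f} minutes)")
```

Under these assumptions, 99.99% availability over four years corresponds to roughly three and a half hours of total downtime.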

Looking ahead, we have some other big platform changes happening through July. We’re replacing our current database server hardware, increasing redundancy with additional active cluster nodes, and migrating all our production storage to dedicated SAN storage. The new hardware has twice the capacity of our current database platform and is also an important step towards our horizontal scale-out strategy.

While we can make much of this move without any scheduled outages, we’re going to need to take Xero Personal offline for a couple of hours next weekend, and then all of the Xero apps offline in late July, again for around two hours. Of course, you’ll get an app notification with more details on these scheduled outages several days before they happen.

These changes to the hosting platform will step us up another level in scalability so we can continue to accommodate the accelerating growth we are seeing in customers and transactions on the Xero platform.

Update: The database upgrade was completed in 55 minutes with a further 10 minutes of testing. Thanks for your patience.
