In an Australian first, Xero Small Business Insights shows more small-to-medium businesses are being paid on time, but average payment times still exceed the payment terms. What else can a deep dive into Xero’s data tell us?
Delving into Xero’s data with advanced analytics techniques shows that the average number of days for an invoice (with 30-day terms) to be paid fell from 39.6 days in June 2016 to 36.2 days in June 2017. This metric is referred to as “debtor days.” Late payments affect cash flow, which makes payment times a vital topic for SMBs, and the issue has recently prompted an inquiry by the Australian Small Business and Family Enterprise Ombudsman.
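To make the metric concrete, here is a minimal sketch of how debtor days could be computed for a handful of invoices. The invoices are made up for illustration and are not Xero data. The sketch also assumes that payment time is measured from issue date to payment date and averaged without weighting, which may differ from how Xero Small Business Insights derives its figures.

```python
from datetime import date

TERMS_DAYS = 30  # payment terms used in the article

# Hypothetical invoices as (issue date, payment date); not Xero data.
invoices = [
    (date(2017, 6, 1), date(2017, 7, 3)),    # paid after 32 days
    (date(2017, 6, 5), date(2017, 7, 15)),   # paid after 40 days
    (date(2017, 6, 10), date(2017, 7, 16)),  # paid after 36 days
    (date(2017, 6, 12), date(2017, 7, 10)),  # paid after 28 days
]


def days_to_pay(issued, paid):
    """Number of days between issuing an invoice and receiving payment."""
    return (paid - issued).days


def average_debtor_days(invoices):
    """Unweighted mean payment time across invoices (an assumed definition)."""
    durations = [days_to_pay(issued, paid) for issued, paid in invoices]
    return sum(durations) / len(durations)


def share_paid_on_time(invoices, terms_days=TERMS_DAYS):
    """Fraction of invoices paid within the payment terms."""
    on_time = sum(
        1 for issued, paid in invoices if days_to_pay(issued, paid) <= terms_days
    )
    return on_time / len(invoices)


avg = average_debtor_days(invoices)
print(f"Average debtor days: {avg:.1f}")                             # 34.0
print(f"Days beyond terms:   {avg - TERMS_DAYS:.1f}")                # 4.0
print(f"Share paid on time:  {share_paid_on_time(invoices):.0%}")    # 25%
```

The two summaries answer different questions: the share paid on time can improve while the average still sits above the 30-day terms, which is the pattern described above.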
Xero’s extensive data set provides exciting opportunities to further analyse and uncover insights on a range of business metrics, using state-of-the-art data science. In this example, we show what is possible by applying a hierarchical Bayesian approach to analyse debtor days differences between states.
In with the old techniques
The concept of Bayesian statistics is not new, with origins dating back to the 1700s. However, it is only in recent years that cheaper and more powerful computing has made it feasible to apply Bayesian statistics to more complex models.
Put simply, Bayesian statistics is about viewing the world as a set of beliefs, where the plausibility of each belief is described using probabilities. Compare this to the commonly used “frequentist” approach, where hypotheses are rejected, or “falsified,” based on an agreed threshold.
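As a toy illustration of beliefs described by probabilities, the sketch below considers nine candidate values for the share of invoices paid on time, gives each the same prior plausibility, and then re-weights them after observing a small sample. The counts (7 on time, 3 late) and the grid of candidates are hypothetical and chosen only to show the mechanics of Bayes’ rule.

```python
# Candidate "beliefs": possible values for the share of invoices paid on time.
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Prior: every candidate is considered equally plausible before seeing data.
prior = [1 / len(candidates)] * len(candidates)


def update(prior, candidates, on_time, late):
    """Apply Bayes' rule on a grid: posterior is proportional to prior x likelihood."""
    unnormalised = [
        weight * (p ** on_time) * ((1 - p) ** late)
        for weight, p in zip(prior, candidates)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical observation: 7 of 10 invoices were paid on time.
posterior = update(prior, candidates, on_time=7, late=3)

for p, plausibility in zip(candidates, posterior):
    print(f"share on time = {p:.1f}: plausibility {plausibility:.3f}")

most_plausible = candidates[posterior.index(max(posterior))]
print("Most plausible candidate:", most_plausible)
```

No candidate is rejected outright. Each simply ends up with more or less plausibility than it started with, and the full set of plausibilities is the result.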
The graph below shows one of the outputs of the debtor days model. The best way to think of this model is like a traditional one-way Analysis of Variance (ANOVA). However, unlike a traditional ANOVA, the Bayesian approach brings a number of advantages:
- Firstly, we are not constrained by the requirement of equal variances across groups, which is often not a practical assumption in the real world, since data doesn’t usually obey textbook assumptions. A Bayesian analysis allows us to place a hierarchical prior on the standard deviation parameters of each group, so that lower-level parameters in the hierarchy inform higher-level parameters, and vice versa.
- Secondly, we can use a t-distributed noise function in our model, meaning that it is robust to the outliers frequently encountered in reality.
- Finally, a Bayesian approach avoids the use of arbitrary p-values and replaces them with posterior distributions of effects and contrasts that provide detailed descriptions of uncertainty.
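The structure described in these points can be sketched as a generative simulation: shared top-level parameters, a mean and a standard deviation for each state or territory drawn from them, and t-distributed noise around each group mean. Every distribution and number below is an assumption made for illustration. The article does not specify the priors or degrees of freedom used in the actual model, and the output is simulated, not Xero data.

```python
import math
import random

STATES = ["NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT"]
MONTHS = 12


def student_t(rng, nu):
    """Draw from a standard Student-t distribution with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-squared with nu degrees of freedom
    return z / math.sqrt(chi2 / nu)


def simulate(rng, nu=4.0):
    """Simulate monthly debtor days per state from an assumed hierarchical model."""
    # Top level: parameters shared by all groups (hypothetical hyperpriors).
    overall_mean = rng.gauss(37.0, 3.0)
    between_group_sd = abs(rng.gauss(0.0, 2.0))
    log_sd_centre = rng.gauss(0.0, 0.5)
    log_sd_spread = abs(rng.gauss(0.0, 0.5))

    data = {}
    for state in STATES:
        # Group level: each state has its own mean AND its own standard
        # deviation, so equal variances across groups are not assumed.
        group_mean = rng.gauss(overall_mean, between_group_sd)
        group_sd = math.exp(rng.gauss(log_sd_centre, log_sd_spread))
        # Observation level: heavy-tailed t noise tolerates occasional outliers.
        data[state] = [
            group_mean + group_sd * student_t(rng, nu) for _ in range(MONTHS)
        ]
    return data


rng = random.Random(2017)
simulated = simulate(rng)
for state, values in simulated.items():
    print(state, [round(v, 1) for v in values[:3]], "...")
```

Fitting the model runs this logic in reverse: given the observed monthly figures, it infers which values of the group-level and top-level parameters are plausible.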
The model outputs are superimposed over 12 months of debtor days figures for each state, with each red circle representing a monthly debtor days figure. Each blue curve is a posterior predictive distribution for a group (state or territory), derived from a Markov Chain Monte Carlo (MCMC) analysis. MCMC methods make it possible to fit hierarchical models, which can involve integrating over many unknown parameters.
The shape of a group’s curve alone provides rich information on the uncertainty of its debtor days. The broad range of curves for the Northern Territory, for example, indicates greater uncertainty surrounding both the mean and standard deviation of NT debtor days. That may reflect a smaller number of NT businesses from which to derive metrics; the territory contains just 1 percent of Australia’s population, the lowest share of any state or territory.
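For readers curious about what an MCMC analysis does, the following is a minimal random-walk Metropolis sketch for a single group. It is far simpler than the hierarchical model described here and is not the sampler used for the graph. The twelve monthly figures, the priors, the fixed degrees of freedom and the step size are all hypothetical choices.

```python
import math
import random

# Hypothetical monthly debtor days for one state (not Xero data).
monthly = [36.8, 37.5, 36.1, 38.0, 37.2, 36.5, 37.9, 36.9, 37.4, 41.0, 36.7, 37.1]
NU = 4.0  # degrees of freedom of the t noise, fixed here as an assumption


def log_posterior(mu, log_sigma):
    """Unnormalised log posterior for the group mean and log standard deviation."""
    sigma = math.exp(log_sigma)
    # Assumed weak priors: mu ~ Normal(37, 10), log_sigma ~ Normal(0, 2).
    lp = -0.5 * ((mu - 37.0) / 10.0) ** 2 - 0.5 * (log_sigma / 2.0) ** 2
    # Student-t log-likelihood, dropping terms that do not depend on mu or sigma.
    for y in monthly:
        z = (y - mu) / sigma
        lp += -log_sigma - 0.5 * (NU + 1.0) * math.log1p(z * z / NU)
    return lp


def metropolis(n_samples, step=0.3, seed=1):
    """Random-walk Metropolis: propose a nearby point, accept it with some probability."""
    rng = random.Random(seed)
    mu, log_sigma = 37.0, 0.0
    current = log_posterior(mu, log_sigma)
    samples, accepted = [], 0
    for _ in range(n_samples):
        prop_mu = mu + rng.gauss(0.0, step)
        prop_log_sigma = log_sigma + rng.gauss(0.0, step)
        proposed = log_posterior(prop_mu, prop_log_sigma)
        # Always accept uphill moves; accept downhill moves with probability
        # equal to the ratio of posterior densities.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            mu, log_sigma, current = prop_mu, prop_log_sigma, proposed
            accepted += 1
        samples.append((mu, math.exp(log_sigma)))
    return samples, accepted / n_samples


samples, acceptance_rate = metropolis(10_000)
kept = samples[2_000:]  # discard early draws as burn-in

mu_draws = sorted(mu for mu, _ in kept)
posterior_mean = sum(mu_draws) / len(mu_draws)
lower = mu_draws[int(0.025 * len(mu_draws))]
upper = mu_draws[int(0.975 * len(mu_draws))]

print(f"Acceptance rate: {acceptance_rate:.2f}")
print(f"Posterior mean of debtor days: {posterior_mean:.2f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

The retained draws approximate the posterior distribution without ever evaluating the integral directly. The width of the interval is the kind of uncertainty that the blue curves convey visually.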
A Xero Small Business Insights deep dive
Even without inspecting the outputs and their differences quantitatively, a number of interesting questions arise, including:
- Are the lower debtor days of Queensland and Tasmania affected by their large tourism-based industries? Tasmania has a population of about half a million, and it sees twice that number of visitors each year. Queensland hosts four times as many tourists as there are residents. Or is there something more at work?
- Why does South Australia have similar payment times to economically stronger states like Victoria and New South Wales, whose output is more than three times that of the Wine State? Is there a relationship between the size of an economy and payment times?
Xero Small Business Insights raises many other questions to explore through deeper analysis. Do these figures vary when split across industry and business size? Are there particular industries that are improving more than others?
Stay tuned for more insights over the coming weeks as we dig deeper into this unique data set.