Innovative Techniques on Big Data in Determining Credit Risk

Monday, 25th May 2015

Using transactional information from SMB clients to derive insights, segmentation, and client health.

At IAPA’s May 2015 seminar in Sydney, Tim Manns of Analytics 8 gave some interesting insights into how customers can be segmented and analysed for risk mitigation.

The basis for the talk was a project for a large bank which wanted to investigate value that could be gleaned from large amounts of transactional customer data.

With a tight brief (just 25 days and no additional resources), Tim delved into an uncommon area of customer data: payment transactions by customers to the Australian Tax Office.

Since tax is paid into one of two ATO accounts which are public knowledge, a bank can view the material relationship between the customer and the tax office over time.

The input into the exercise was:

  • a two-year history for a random sample of 50,000 bank customers
  • of those 50,000, 200 customers were separately deemed to be ‘loans at risk’, several months subsequent to the chosen time window.

Two factors were the focus: a negative correlation between tax payments and bank balance (i.e. falling bank balances coincided with tax payments), and a positive correlation, which suggested the customer might hold accounts with another bank.
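To make the correlation idea concrete, here is a minimal sketch of a per-customer Pearson correlation between ATO payments and account balance. The talk did not specify which correlation measure was used, so Pearson is an assumption, and the figures below are hypothetical illustrations rather than data from the project.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat series has no defined correlation; this sketch reports none.
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical quarterly figures for two illustrative customers.
ato_payments = [10_000, 12_000, 15_000, 18_000]
balance_falling = [90_000, 70_000, 50_000, 20_000]  # balance drops as tax paid rises
balance_rising = [40_000, 48_000, 61_000, 70_000]   # balance rises with tax paid

r_negative = pearson(ato_payments, balance_falling)
r_positive = pearson(ato_payments, balance_rising)
print(round(r_negative, 2), round(r_positive, 2))
```

The first customer yields a strongly negative coefficient and the second a strongly positive one, which are the two patterns the analysis looked for.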

Tim used k-means clustering on the basis of:

  • the pattern of payments to the ATO (which clearly had to be normalised to a quarterly basis, as some clients paid monthly and some quarterly)
  • the relationship of ATO payments to client account balance.
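The two steps above, normalising payments to a quarterly basis and then clustering, can be sketched as follows. This is an illustrative pure-Python version of Lloyd's k-means algorithm, not the SAS procedure used in the project; the helper names (`to_quarterly`, `kmeans`) and the two-feature layout are assumptions made for the example, and the feature values are hypothetical.

```python
import random

def to_quarterly(monthly):
    """Sum consecutive groups of three monthly payments into quarterly totals."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) for i in range(0, len(monthly), 3)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# A monthly payer brought onto the same quarterly basis as everyone else.
quarterly = to_quarterly([1000, 1000, 1000, 1200, 1200, 1200])

# Hypothetical per-customer features:
# (ATO payment growth, payment/balance correlation)
features = [
    (0.40, 0.90), (0.35, 0.80), (0.45, 0.85),
    (-0.30, -0.70), (-0.40, -0.80), (-0.35, -0.90),
]
centroids, labels = kmeans(features, k=2)
print(quarterly, labels)
```

With only two well-separated groups, the first three customers end up in one cluster and the last three in the other. A real run on 50,000 customers would use more features and five clusters, as described in the article.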

The model was deliberately aimed at yielding roughly equal segments, to keep the results easy for stakeholders to interpret. The segments were then named:

  • Paybacks (21%) – SMBs with ATO refunds or declining loan balances; these ones were paying back their loans!
  • Strong Steady (20%) – large balances, large ATO payments with small growth.
  • Risky Business (17%) – large but declining ATO payments; account balances declining from well in the black. These clients either had declining business, or were moving to another bank.
  • Strong Growth (23%) – high positive correlations between balance and ATO payments; huge ATO payment growth.
  • No Future (19%) – negative correlations between balance and ATO payments; largest reduction in payments from previous year. Low average loans and increasing loan amounts.

Of course, since the clients varied in size, payments and balances were scaled to focus on trends rather than absolute amounts.
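One way to scale for trend rather than size is sketched below. The article does not say which scaling was applied, so dividing each series by its own mean and computing a year-on-year relative change are both assumptions for illustration, and the client figures are hypothetical.

```python
def scale_to_mean(series):
    """Divide each value by the series mean so clients of any size are comparable."""
    mean = sum(series) / len(series)
    if mean == 0:
        return [0.0 for _ in series]
    return [value / mean for value in series]

def year_on_year_change(series, periods_per_year=4):
    """Relative change between the latest year's total and the previous year's."""
    previous = sum(series[-2 * periods_per_year:-periods_per_year])
    latest = sum(series[-periods_per_year:])
    if previous == 0:
        raise ValueError("previous year total is zero; relative change undefined")
    return (latest - previous) / previous

# Two hypothetical clients with the same trend but a 100x difference in size
# (eight quarterly ATO payments each, i.e. a two-year history).
small_client = [1_000, 1_100, 1_200, 1_300, 1_400, 1_500, 1_600, 1_700]
large_client = [value * 100 for value in small_client]

scaled_small = scale_to_mean(small_client)
scaled_large = scale_to_mean(large_client)
growth_small = year_on_year_change(small_client)
growth_large = year_on_year_change(large_client)
```

After scaling, the two clients are indistinguishable, so a clustering step sees only the shape of their payment history and not the absolute dollar amounts.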

The exercise demonstrated that patterns in ATO payments had some predictive power for accounts subsequently deemed to be at risk. More generally, transactional data can be a useful, distinct source of behavioural information.

And where does Big Data come in? Our typical picture of big data is structured and unstructured, heterogeneously sourced data of a size that calls for non-traditional approaches to number crunching, such as Hadoop. In fact, it can be a number of things, including some or all of these. Further useful discussion can be found on Webopedia and Wikipedia.

Tim Manns had generously stepped in as a late substitute for the intended seminar, so he can be excused for not covering all possible bases. Big Data is not always a specific technology; in this case, Tim used SAS on a Teradata data platform.

Upcoming IAPA events can be found here.

Thanks to Stephen Simmonds for providing this article.
