Mathematics doesn't seem personal - numbers tend to obscure intimacy. It's hard to be passionate about 92, 5.2, or 3.14. Well, maybe not 3.14 - that's a pretty cool number.

As data scientists (or, as a colleague of mine prefers, numerical ninjas), we deal with numbers every day; they're our livelihood. However, it's surprisingly easy to forget that those numbers really do represent something. Someone's mortgage, someone's trip to the convenience store on the weekend, or even someone's life-threatening illness.

This is pretty heavy when you get down to it - there's an elephant in the room that we're still coming to terms with. As analysts, we want as much information as possible. This information, unfortunately, comes with an associated loss of privacy. The New York Times recently highlighted the power of analytics by writing about how it's possible to identify whether or not someone is pregnant purely from their purchasing patterns.

That in itself doesn't surprise me - being honest, I've seen far more impressive uses of predictive modelling and statistically-based inferences. However, it does flag an important point: as analysts, what role does ethics play in how we generate insights?

To my mind, this is a hard question. It's also one that needs to be asked - as the amount of data that's available increases, so does the potential for abuse. Medical associations have their own codes of ethics. So do accountants.

Do data scientists also need a code of ethics? And if so, what would one look like? 

What are your thoughts?

This is something the Australian Computer Society (ACS) has been on about for many years: to be a true _profession_, not just a career, you need (a) a governing body that creates a documented set of ethics that people can sign up to, (b) accreditation, part of which involves said signing up, and (c) government recognition and enforcement.

Hence bodies such as the AMA for medicos, who developed the AMA Code of Ethics, which is then used by the Australian Health Practitioner Regulation Agency (AHPRA) to _regulate_ doctors via registration and accreditation schemes.

You could do worse than starting with the ACS Code of Professional Conduct and Practice.
  • Steven R Clark
    Researcher Unemployed
    Currently completing a PhD examining the possibility of 'best practice' in information governance of privacy and security issues in identification/identity management ...
    4 years ago
Ethics are part of what sets a profession apart from a trade, or a hobby.

Having a code of ethical conduct is, of course, only the beginning. Educating practitioners and enforcing/reinforcing ethical conduct is a basic requirement of professional practice for any profession.

While it is tempting to dismiss ethics as either esoteric or an impediment to employment, a solid ethical position regarding the purposes and the conduct of professionals in practice establishes not only good reputation - it also provides a sound rationale for decision-making. If analysts are not merely to be 'guns for hire', having a grounded basis for accepting or rejecting assignments, and for designing appropriate approaches to undertaking them, provides for consistency, reliability, and repeatability.

Legality alone ought not to shape or drive decision-making or conduct: expecting The Law to cover every potential or eventual circumstance is folly. As is relying upon the whims and fashions of the day ... today's master may well be tomorrow's convict.
  • Tim Manns
    Business Consultant SAS
    I'm a Data Mining Dude!
    4 years ago
A recent news article reported on Telstra passing every customer's web browsing history to a third party without consent. Another blatant privacy invasion. If there is a public lawsuit on this I'd join it.
An interesting article - thanks Evan. It seems to me that if we don't address this problem, then I will inevitably be complicit in outcomes which I abhor! Where could we start to develop an appropriate code of ethics to guide us in the handling of big data, given its growing pervasiveness and the growing discovery power of our tools? Are there any universal principles (First, do no harm)?
  • Mark Aitkin
    Insights Analyst Pacific Micromarketing
    5 years ago
Tim makes a good point: should we only analyse data that has been knowingly handed over? When a user presses OK on an application, should a big red sign pop up and say that your information will be used for analyses and to make someone else a profit (or should such disclosures stay buried in small print on the outskirts of Timbuktu)?
Obviously the application of the analytics matters - that's a given - but there are standards that apply during the R&D phase as well. The National Privacy Principles are a good case in point about data being used in certain ways. For example, when building a model (as opposed to deploying it), is it OK to include a feature that is discriminating but also discriminatory to see whether it has an effect?
  • Craig McDonald
    Professor of Informatics University of Canberra
    5 years ago
Most professional societies have codes of ethics, codes of conduct, etc. (see Professions Australia, or the ACS).

It's a useful process to create such things, as they do engage a group in thinking about what's important, beyond the flexing of personal skill and creativity. And codes can help new people to see the risks in getting carried away (there is even a code for currency market traders).

What should be in it? One fairly unique aspect of analytics is that it creates meaning. Take the MySchool system for example. It created a new version of what 'educational quality' means, one based on aggregating individual kids' test scores. What aggregations mean (ontologically) is quite different from what the individual measures mean. So some schools doing great individual work with difficult kids were pilloried as being low quality. Harm was done through the misuse of data. It was predictable. Analytics created the harm. Analysts need to take responsibility for the truthfulness of the data they handle and for the impacts of their results. "I was just doing what I was told" is nowhere near good enough.

A code of ethical analytics should perhaps start, and possibly end, with veracity.
  • Tim Manns
    Business Consultant SAS
    I'm a Data Mining Dude!
    5 years ago
My personal opinions follow, of course....

Currently it is extremely worthwhile to have absolutely no ethics at all when applying data analytics. The more advanced, bigger, and unethical the better.

I'm still waiting to read about a high profile lawsuit. Only once a fine is given that makes a dent in the pocket of someone rich will the growing corruption within analytics begin to stall. And yes I do think it is that bad, and no it isn't just in the US (although they are pretty good at it).

As a recent example of (in my opinion) highly illegal and intentionally unethical behaviour we saw the development of an iPhone app that steals all your phone numbers and contact list. Ingenious! Unsurprisingly this guy (Dave Morin) is *really* rich, just not very honest.

Here is pause for thought: Dave Morin, the CEO of PATH, was amongst the original Facebook developer team. Just how safe do you think your Facebook data really is?
Made me laugh…

Oh and of course we have the GFC. The main thing we learn is just don't get caught when fudging the numbers.....
