The 10 biases you should know to Break the Bias in Analytics
Algorithms are becoming an ever larger part of our daily lives, whether as decision-support tools (recommendation or scoring algorithms) or as autonomous systems embedded in intelligent machines (such as self-driving vehicles). As analytics is deployed across sectors and industries for its efficiency, its results are increasingly discussed and contested. In particular, some algorithms are accused of being black boxes and of leading to discriminatory practices linked to gender or ethnic origin.
For those who work in analytics or with analytics insights, we take a moment on International Women’s Day to describe the biases related to analytics and outline ways to #BreaktheBias.
Why is reducing bias important?
As the name implies, machine learning learns from the information it is fed to spot insights in data it has never seen before. However, one of the many challenges analytics professionals such as data scientists face is ensuring that the training data is "clean", accurate, correctly labelled (in supervised learning), and free of anything that could skew the results.
A statistical learning system with decision power that relies on tainted information can cause serious problems. Notorious examples of algorithmic bias include a Google image recognition system that misidentified images of minorities in an offensive way; automated credit decisions from Goldman Sachs that triggered a gender-bias investigation; and a racially biased AI program used in criminal sentencing.
As a result, organisations should be cautious about machine learning bias. AI's contributions to efficiency and productivity will be undermined if algorithms discriminate against individuals or minorities.
Biases in artificial intelligence models are not limited to biases against individuals. A skewed dataset can also hurt a company's business. Interpreting gestures, dress codes, financial results, and so on requires accounting for many factors and contexts; otherwise, running the algorithm becomes a waste of computing resources.
The analytics professionals behind these algorithms should recognise that it is virtually impossible to avoid bias entirely. For cultural, economic, political, organisational, technological, data-selection, and correlation reasons, a statistical learning model is a snapshot of the way of thinking that produced it. This invites us to take a step back and improve the relevance of analytics by embracing diversity of thought through more gender-balanced and diverse teams.
10 Different types of biases to overcome in analytics:
1. Gender bias
As its name suggests, gender bias favours one gender over another. In today's society, the term is often used to refer to the preferential treatment men receive, specifically white, heterosexual men. It is often labelled "sexism" and describes prejudice against women based solely on their gender.
This bias affects both the composition of data analytics teams and the results they derive, producing skewed analytics and insights that can be "blind" to half of society.
2. Application bias
Application bias results from applying analytical models to production datasets whose distribution differs from that of the training set. When real-life production data contain populations that had little or no representation in the training set, you risk drawing some very skewed insights.
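As a minimal sketch of how you might detect this, the snippet below flags production categories that were absent or rare in the training data. The function name, data, and threshold are all illustrative, not part of any particular toolkit:

```python
from collections import Counter

def unseen_or_rare(train_values, prod_values, min_train_share=0.01):
    """Flag production categories that were absent or rare in training.

    A model applied to such categories is extrapolating, which is a
    common source of application bias.
    """
    train_counts = Counter(train_values)
    total = sum(train_counts.values())
    return {
        value
        for value in set(prod_values)
        if train_counts.get(value, 0) / total < min_train_share
    }

# Invented example: the model never saw region "APAC" during training.
train = ["EU"] * 60 + ["US"] * 40
prod = ["EU"] * 30 + ["US"] * 30 + ["APAC"] * 40
print(unseen_or_rare(train, prod))  # {'APAC'}
```

In practice you would run a check like this on every categorical input before scoring, and either retrain or route flagged records for manual review.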
3. Confirmation bias
Confirmation bias originates from selecting only the data and information that reinforce or confirm what you already believe, rather than information that might contradict your preconceived ideas.
4. Algorithm bias
The way you design your artificial intelligence and machine learning models is also crucial to achieving responsible and fair AI. Algorithmic bias refers to systematic and repeatable errors that lead to unfair results, such as privileging one arbitrary group of users over others. Bias in algorithms can originate from unrepresentative or incomplete training data, or from data that encodes historical inequalities.
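One common way to surface this kind of bias is to compare outcome rates across groups. The sketch below computes a disparate-impact ratio from hypothetical (group, decision) pairs; the 0.8 threshold mentioned in the comment is the informal "four-fifths rule" used in some fairness audits, and all data here is made up:

```python
def selection_rates(outcomes):
    """Positive-outcome rate per group, from (group, approved) pairs."""
    totals, positives = {}, {}
    for group, approved in outcomes:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(approved)
    return {g: positives[g] / totals[g] for g in totals}

def disparate_impact_ratio(outcomes):
    """Lowest group selection rate divided by the highest.

    Values below 0.8 are commonly treated as a warning sign
    (the informal "four-fifths rule").
    """
    rates = selection_rates(outcomes)
    return min(rates.values()) / max(rates.values())

# Invented loan decisions: group A approved 50%, group B only 20%.
decisions = ([("A", True)] * 50 + [("A", False)] * 50
             + [("B", True)] * 20 + [("B", False)] * 80)
print(disparate_impact_ratio(decisions))  # 0.4
```

A low ratio does not by itself prove discrimination, but it tells you where to look harder at the model and the data behind it.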
5. Sampling bias
A common error in data collection is a lack of representativeness, so some items end up oversampled relative to reality. Take the example of a company that wants to predict breakdowns of its machines: if it collects mostly error information, the algorithm will not accurately learn what normal operation of the equipment looks like.
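The breakdown example can be made concrete. The sketch below measures how skewed the collected labels are and derives per-class weights that restore realistic prevalence before training; the prevalence figures are invented for illustration:

```python
from collections import Counter

def class_shares(labels):
    """Fraction of the sample belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def importance_weights(sample_shares, real_shares):
    """Per-class weights so a skewed sample mimics real prevalence."""
    return {c: real_shares[c] / sample_shares[c] for c in sample_shares}

# Invented maintenance logs: failures dominate the sample even though
# the machines actually run normally ~99% of the time.
collected = ["failure"] * 900 + ["normal"] * 100
weights = importance_weights(class_shares(collected),
                             {"failure": 0.01, "normal": 0.99})
print(weights)  # failures down-weighted (~0.011), normal up-weighted (~9.9)
```

Resampling the data to match real prevalence is an equally valid alternative to weighting; the point is that the mismatch is measured, not ignored.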
6. Exclusion bias
Like sampling bias, exclusion bias comes from inappropriately removing data from the data source. For example, when you have petabytes of data, it is tempting to select a small sample for training, but doing so may inadvertently exclude some data, resulting in a biased dataset. Exclusion bias can also result from deleting records as duplicates when they are in fact distinct.
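A small illustration, with made-up records, of how deduplicating on a partial key silently drops distinct rows while a full-record comparison keeps them:

```python
records = [
    {"name": "J. Smith", "city": "Leeds"},
    {"name": "J. Smith", "city": "Perth"},  # a different person
]

# Naive: deduplicating on name alone silently drops a distinct record.
by_name = {r["name"]: r for r in records}
print(len(by_name))  # 1

# Safer: treat rows as duplicates only when every field matches.
unique = {tuple(sorted(r.items())) for r in records}
print(len(unique))  # 2
```

The same caution applies in reverse: a full-field match can still miss true duplicates that differ only in formatting, so dedup keys deserve explicit review rather than defaults.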
7. Survivor bias
Survivor bias is a category of selection bias that overstates the likelihood of success of an initiative by focusing attention on successful subjects who are statistical exceptions rather than representative cases.
8. Reporting bias
Reporting bias is a distortion of presented research data caused by the selective disclosure or withholding of information by those involved in choosing the topic of study or in the design, conduct, analysis, or dissemination of its methods and findings.
9. Group attribution bias
Group attribution error is the belief that an individual's characteristics always follow the beliefs of a group they belong to, or that a group's decisions reflect the feelings of all its members. Among analytics professionals, this bias drives individuals to prefer and collaborate with those who share similar characteristics or backgrounds, excluding other team members.
10. Racial bias
Racial bias originates from considering factors that disproportionately impact a race or ethnic group as a proxy or key variable in other data analysis or algorithms.
It is essential to distinguish racial bias from racism or discrimination. Implicit biases are unconscious associations: the person is likely unaware of the biased association, which can heavily influence how data is collected, how models are built, and how results are interpreted.
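A simple way to spot a proxy variable of the kind described above is to test how well it predicts the protected attribute on its own. This sketch, using invented data, scores a feature by per-category majority vote:

```python
from collections import Counter, defaultdict

def proxy_strength(feature, protected):
    """How well a candidate feature predicts a protected attribute,
    measured as the accuracy of a per-category majority vote.
    A value near 1.0 means the feature is acting as a proxy.
    """
    by_value = defaultdict(list)
    for f, p in zip(feature, protected):
        by_value[f].append(p)
    correct = sum(Counter(ps).most_common(1)[0][1]
                  for ps in by_value.values())
    return correct / len(protected)

# Invented data: postcode almost perfectly predicts group membership,
# so using it as a model input can smuggle the protected attribute in.
postcode = ["10001"] * 50 + ["60601"] * 50
group = ["A"] * 48 + ["B"] * 2 + ["B"] * 49 + ["A"] * 1
print(proxy_strength(postcode, group))  # 0.97
```

A high score does not mean the feature must be dropped outright, but it should trigger a deliberate decision about whether its predictive value is legitimate or merely a stand-in for race or ethnicity.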
4 steps to encourage a gender-balanced team:
The following are a few suggestions to foster inclusion and diversity in your workplace and ensure you retain your top female talent:
1. Identify potential sources of bias and assess your situation.
Using the list established above, it is possible to significantly reduce the harmful effects we have described. First, it is important to understand what your organisation is doing right and where it can improve when it comes to diversity and inclusion. For example, you can start by assessing pay equity within your firm and identifying any problems.
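As an illustration of the pay-equity assessment, the sketch below computes a median pay gap from hypothetical (gender, salary) pairs. It is a starting point only; a real review would also control for role, level, and tenure before drawing conclusions:

```python
def median(values):
    """Median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def median_pay_gap(salaries):
    """Gap between group medians as a fraction of the higher median,
    from (gender, salary) pairs."""
    by_gender = {}
    for gender, salary in salaries:
        by_gender.setdefault(gender, []).append(salary)
    medians = [median(v) for v in by_gender.values()]
    return (max(medians) - min(medians)) / max(medians)

# Invented payroll extract.
staff = [("F", 62000), ("F", 70000), ("M", 80000), ("M", 72000)]
print(round(median_pay_gap(staff), 3))  # 0.132
```

Tracking this figure over time, per department and per level, turns "assess your situation" from a slogan into a measurable baseline.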
2. Put policies in place to avoid bias
Organisations need to establish guidelines, rules, and procedures for identifying, communicating, and potentially mitigating bias. Forward-thinking organisations document instances of bias as they occur, describe the steps taken to identify the bias, and explain the efforts to mitigate it. Ensure your company policies and guidelines are bias-free and take corrective action for those that aren't. For example, parental leave policies should be the same for men and women, and embracing flexible working arrangements allows team members to manage work-life balance more holistically.
3. Create gender-neutral recruitment and HR processes
Start the connection with the market and potential candidates in a gender-neutral way, and continue that through the entire recruitment process:
● Proactively build teams with a balanced gender mix
● Ensure job descriptions and job ads use gender-neutral language
● De-identify resumes and job applications
● Be open to considering team members who have developed their skills outside of traditional pathways
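One of the steps above, de-identifying resumes, can be sketched crudely with regular expressions. The word list and function name here are purely illustrative; production screening tools are far more thorough (photos, pronouns in references, club memberships, and so on):

```python
import re

# Illustrative, deliberately incomplete word list.
GENDERED = {"he", "she", "him", "her", "his", "hers", "mr", "mrs", "ms"}

def redact(text, candidate_name):
    """Crudely de-identify a resume snippet: mask the candidate's
    name and common gendered words."""
    text = re.sub(re.escape(candidate_name), "[CANDIDATE]", text,
                  flags=re.IGNORECASE)
    return re.sub(r"\b(" + "|".join(GENDERED) + r")\b\.?",
                  "[REDACTED]", text, flags=re.IGNORECASE)

print(redact("Ms Jane Doe led the team; she shipped on time.",
             "Jane Doe"))
# The title, name, and pronoun are all masked before reviewers see it.
```

Even a rough pass like this changes reviewer behaviour; the harder engineering problem is catching indirect signals, which is why de-identification works best alongside structured, criteria-based scoring.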
4. Create a supportive culture
The gender gap in the workplace typically emerges across three stages: recruitment, promotion, and retention. To reduce this gap, you need to:
● Build a culture that is respectful and supportive of all team members
● Actively seek input from all members of the team and then treat all opinions and input as equal
● Involve women at each stage and ensure they are progressing to leadership positions
● Identify potential senior candidates and mentor them
● As a manager or team leader, model gender-inclusive behaviour
● Build the necessary infrastructure, support, and culture to keep women in top positions
Data reveals that organisations with diverse and inclusive environments reap benefits that translate into better products, competitive advantage, and increased profits. To build an effective solution, analytics teams should be as diverse as the populations their solution will impact. Breaking some of these biases today starts you on that journey.