Written by Claire Houston, speaker at AICon and Software Engineer at Kainos.
Biased AI isn’t a future problem; it’s a current one. From credit cards to health diagnoses, these issues are going viral and up for debate on the global stage of Twitter.
Artificial Intelligence is created and used by humans, trained to model our understanding and our intelligence. The theory is that AI programs can make decisions faster and more intelligently than humans can, and we are fast coming to rely on them in our daily lives. But to do as we do, they must reflect who we are, and so they also reflect our biases.
We’re seeing the flaws of AI programs in a wide range of areas, from health to banking, wherever these systems have reach. One benefit, however, of this brave new world of apps and AI is the global platform it creates. People are using that platform to shine a spotlight on these programs and to show the discrimination they’re causing. And as we all know, when a tweet goes viral, it gets everywhere.
What is biased AI?
AI is the creation of programs that learn from data to make human-like decisions. A new AI program looks at all the decisions made previously, finds patterns in the data and tries to work out what causes a particular decision to be made. When making a new decision, the program takes a best guess based on all the previous ones. So the data used to create an AI program is vitally important; every decision it makes is built on that data.
Icons made by smashicons from www.flaticon.com
Basing all new decisions on previous decisions can be a key problem. The algorithms we use cannot determine the true correct answer; they can only arrive at the same answers humans came up with before, introducing all the flaws and biases that already exist in any given system. Another challenge is the transparency of an AI program. In some cases researchers can understand why a decision has been made, but at other times the program becomes too complex and the researcher can’t pinpoint the exact reason. When a decision can’t be traced, the AI program is called a black box.
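As a rough illustration of that transparency gap, the sketch below uses scikit-learn and entirely made-up data (my own example, not any real system): a simple linear model’s decision can be read off its weights, while a boosted-tree ensemble spreads its prediction across hundreds of small trees, with no single explanation to point to.

```python
# A rough illustration of the transparency gap, using made-up data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 2] + rng.normal(0, 0.5, size=500)) > 0

# Transparent: each weight says how a feature pushed the decision.
linear = LogisticRegression().fit(X, y)
print("linear weights:", linear.coef_.round(2))

# Far more opaque: the prediction is spread across hundreds of small
# trees, so there is no single set of weights to show a customer
# who asks "why was I given this decision?".
boosted = GradientBoostingClassifier(n_estimators=300).fit(X, y)
print("number of trees:", boosted.n_estimators_)
```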
Biased AI is an AI program that has been influenced, by biased data, to make decisions which favour one group over another. A simple metaphor for how this can happen is teaching a child the names of animals.
Icons made by flaticon from www.flaticon.com
We show a child lots of pictures of sheep and a few pictures of cows, say 60 sheep and 5 cows. Then we take them to a farm and ask them to play a game, Guess the Animal: sheep or cow. The game has a reward, a sweet, for every animal correctly guessed. Chances are they will guess sheep more often than cow, because they know they’re more likely to be rewarded. They’ll risk getting a few cows wrong to maximise the reward; their bias will skew their answers.
In our example, the bias is easy to both identify and solve: too many sheep, so show more cows. But in the real world, with complex problems, biases will not be easy to identify and are caused by a host of different issues. Maybe the data set is out of date, reflecting decisions made when cultural standards were different. Maybe the data set isn’t varied enough, failing to represent our diverse, global population. If these datasets aren’t studied carefully, their biases will be passed on to the AI, causing problems, large problems, with the potential to go viral.
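To make the sheep-and-cow metaphor concrete, here is a minimal sketch, again with made-up numbers and scikit-learn, of a classifier trained on 60 sheep and 5 cows. Like the child, it learns that guessing “sheep” is the safer bet.

```python
# A minimal sketch of the sheep/cow metaphor: a classifier trained on an
# imbalanced dataset (60 sheep, 5 cows) learns to favour the majority
# class. The features and numbers are made up purely for illustration.
from collections import Counter

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend each animal is described by two noisy measurements
# (say "woolliness" and "size"). The classes overlap, as real data does.
sheep = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(60, 2))
cows = rng.normal(loc=[0.0, 1.0], scale=1.0, size=(5, 2))

X = np.vstack([sheep, cows])
y = np.array(["sheep"] * 60 + ["cow"] * 5)

model = LogisticRegression().fit(X, y)

# Test on a balanced farm: 50 sheep and 50 cows drawn the same way.
test_sheep = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(50, 2))
test_cows = rng.normal(loc=[0.0, 1.0], scale=1.0, size=(50, 2))
preds = model.predict(np.vstack([test_sheep, test_cows]))

print(Counter(preds))  # predictions skew heavily towards "sheep"
```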
It was the algorithm!
A recent Twitter storm blew up in November around Apple and Goldman Sachs and their joint credit card, the Apple Card, released on 20 August this year, which has been offering some women a much lower credit limit than their husbands. David Heinemeier Hansson (DHH), the creator of Ruby on Rails and co-founder and CTO of Basecamp, wrote a viral thread about the issue: specifically, that he had received 20 times the credit his wife had, even though she had a better credit score and, unlike him, was an American citizen. Two basic facts that would arguably put her in a position to receive an equal or larger line of credit. The thread took off, receiving thousands of likes and retweets. Many followers agreed with DHH, including, remarkably, Apple’s co-founder Steve Wozniak, whose wife had a similar problem: he was offered a credit limit 10 times greater than hers. Wozniak explained that he and his wife share accounts, cards and assets, so the difference in credit should have been negligible. It should be noted that although both are wealthy men, many people from a variety of backgrounds have come forward with similar stories.
Tweet by Steve Wozniak (@stevewoz)
Both DHH and Steve Wozniak agreed that it was hard to find a human to correct the mistake, with DHH tweeting that on two separate occasions Apple staff simply couldn’t provide an answer. The first agent blamed the lower credit on the algorithm, and the second hinted that their credit scores would justify the disparity, which DHH asserted was factually untrue. DHH expressed his anger at this line of defence: the companies hiding behind their algorithm.
So nobody understands THE ALGORITHM. Nobody has the power to examine or check THE ALGORITHM. Yet everyone we’ve talked to from both Apple and GS are SO SURE that THE ALGORITHM isn’t biased and discriminating in any way. That’s some grade-A management of cognitive dissonance.
— DHH (@dhh) November 8, 2019
Goldman Sachs have released a statement claiming that the algorithm doesn’t know your gender or marital status. But with AI this doesn’t always matter: a model can connect other data points and make decisions based on them just as if it had the original information, in this case gender. Customer agents being unable to justify or pinpoint the exact problem sets a very worrying precedent, and one that does not look good for an AI-driven future. Apple, so far, have not commented.
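To see why dropping the gender column isn’t enough, here is a small, purely synthetic sketch; the features and numbers are invented and have nothing to do with Apple’s or Goldman Sachs’ actual system. A “proxy” feature that happens to correlate with gender lets the model reproduce a gender gap it was never explicitly told about.

```python
# A sketch of the "proxy variable" problem: the protected attribute is
# never given to the model, yet a correlated feature lets it reconstruct
# the same pattern. All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000

gender = rng.integers(0, 2, n)          # 0 or 1; never shown to the model
proxy = gender + rng.normal(0, 0.3, n)  # e.g. a spending pattern that tracks gender
income = rng.normal(0.5, 0.1, n)        # income in units of $100k (illustrative)

# Historical decisions were biased: gender itself raised the approval odds.
approved = income + 0.5 * gender + rng.normal(0, 0.2, n) > 0.8

X = np.column_stack([proxy, income])    # note: the gender column is NOT included
model = LogisticRegression().fit(X, approved)

rates = [model.predict(X[gender == g]).mean() for g in (0, 1)]
print(f"approval rate, group 0: {rates[0]:.2f}, group 1: {rates[1]:.2f}")
# The gap persists even though the model never saw "gender".
```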
The controversy has blown up and moved beyond Twitter, with media outlets from across the world reporting on it. Presidential candidate Elizabeth Warren has spoken out against Apple and Goldman Sachs, suggesting that if the company cannot explain how the algorithm makes decisions, it should be taken down. She was firmly against Goldman Sachs’ current stance, where the responsibility for correction was put on the individual, arguing “So let’s just tell every woman in America, ‘You might have been discriminated against, on an unknown algorithm, it’s on you to telephone Goldman Sachs and tell them to straighten it out.’ Sorry, guys, that’s not how it works.” The responsibility should not be on the customer to fix a broken system; Apple and Goldman Sachs should not offer a faulty product.
It was the chatbot!
Back in September, a health-related storm broke out over the NHS-funded app Babylon. In the UK, Babylon’s app provides NHS services to patients in England, offering them the chance to see a GP “in minutes”. They also offer a tool called the Symptom Checker, which is essentially a chatbot that helps patients get medical information and work out whether they need to see a doctor.
Babylon GP at Hand patients can see a GP 24/7 from their phone, which eases the pressure on our #NHS A&E https://t.co/1rJ8Ht8oEN
— Babylon GP at Hand (@GPatHand) November 14, 2019
A Twitter user going by the handle Dr Murphy, who states he is an NHS consultant, performed a test on Babylon’s Symptom Checker. He entered the exact same symptoms for a man and a woman of the same age and background, each having a heart attack, but found that the two genders received different medical advice. The woman’s symptoms returned results of anxiety or depression, while the man’s were linked to angina or a heart attack. Both fictional patients were suffering from heart attacks, but only one was recommended to see a doctor.
The @babylonhealth Chatbot has descended to a whole new level of incompetence, with #DeathByChatbot #GenderBias.
— Dr Murphy (@DrMurphy11) September 8, 2019
Classic #HeartAttack symptoms in a FEMALE, results in a diagnosis of #PanicAttack or #Depression.
The Chatbot ONLY suggests the possibility of a #HeartAttack in MEN! pic.twitter.com/M8ohPDx0LX
In response, Babylon released a blog post on the topic of bias in medicine to explain their app’s behaviour. They wrote of their awareness of the biases in medical diagnoses, specifically those faced by women, and of how they aim to prevent this bias within their app. They even argued that their AI could be less biased than a human doctor, as it’s less likely to be swayed by previous cases. Babylon’s app will always recommend the most probable result based on its current data.
These are two compelling arguments for their app, but combined they instead serve to highlight its potential dangers. The app will not tell a woman with heart attack symptoms to go and see a doctor because, based on its current, biased data, a heart attack is not the most probable result. The AI will give the correct result to 99 people, or 999, or 9,999, but for the one person who receives the wrong answer, the consequences could be lethal.
Both of the above cases show AI working perfectly well for the majority while discriminating against a minority. Both examples surfaced on Twitter and subsequently received extensive mainstream media coverage, but future biases may not be so obvious. These are only two examples; more cases of biased AI will arise, and not just sexism as we’ve seen here. Race, age, sexuality: new bias could target any of these.
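This is also why a single headline accuracy number isn’t enough. The sketch below, with entirely invented figures, shows how an overall rate can look passable while one group is consistently failed; only evaluating each group separately makes the gap visible.

```python
# Invented figures only: an overall metric can hide a group the system
# consistently fails, so results should always be broken down per group.
import numpy as np

rng = np.random.default_rng(2)

# 2,000 fictional patients reporting heart-attack symptoms, half men and
# half women, all of whom genuinely need urgent care.
group = np.array(["men"] * 1000 + ["women"] * 1000)

# A hypothetical symptom checker trained on biased records: it refers
# 95% of the men to a doctor but only 45% of the women.
referred = np.where(group == "men",
                    rng.random(2000) < 0.95,
                    rng.random(2000) < 0.45)

print(f"overall referral rate: {referred.mean():.0%}")  # ~70%, looks passable
for g in ("men", "women"):
    print(f"{g}: {referred[group == g].mean():.0%} referred")
```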
What can be done?
Biases are an unfortunate reality, but biased AI doesn’t have to be. There are many ways we can help prevent it.
Lawmakers can help protect us and react to these new developments. The New York State Department of Financial Services is investigating Goldman Sachs over the Apple Card, which may have broken sex discrimination laws.
#NewYork law prohibits discrimination against protected classes of individuals. Therefore, @NYDFS will examine whether the algorithm used to make these credit limits decisions violates state laws.
— Linda Lacewell (@LindaLacewell) November 10, 2019
In her Medium post, Linda Lacewell, NYDFS Superintendent, wrote: “For innovation to deliver lasting and sustained value, the consumers who use new products or services must be able to trust they are being treated fairly.”
Information sharing can also help prevent biased AI, both by using Twitter to call attention to problems, as we’ve seen here, and to educate. Twitter gives those with knowledge of AI and AI ethics the chance to speak out and address the potential harm caused by these biased programs. One example is Arvind Narayanan, a computer science professor at Princeton, speaking out against researchers releasing tools that are not fit for purpose.
Whenever someone points out sexist/racist stereotypes in a new AI tool, you’ll find lots of apologists in the comments saying, "What's the surprise? It just reflects the training data".
— Arvind Narayanan (@random_walker) September 23, 2019
Well, the "surprise" is that researchers keep releasing these tools as if everything’s fine. https://t.co/YTG2pLhcu1
A final suggestion for stopping biased AI is to prevent it at the root. Why let these problems impact people when we could remove the biases at the data stage, before they get baked into the AI programs? I’m not going to pretend that’s a simple task; it will require more thought than simply throwing all of our data at a problem.
This solution involves diverse teams: groups of people from all backgrounds who are able to spot the issues that a homogeneous group will not. It involves studying the data, identifying where commonalities and differences lie and ensuring that it is representative. And it involves testing the AI thoroughly, making sure it meets the required standards for all user needs before going out to market.
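As one small example of what “studying the data” might look like in practice, here is a hypothetical audit sketch; the tiny table stands in for real historical records, and the column names are my own invention. Before any model is trained, we can check how well each group is represented and whether past outcomes already favour one group.

```python
# A minimal, hypothetical data audit. The tiny inline table stands in
# for a real set of historical decisions; in practice you would load
# your own records (e.g. with pd.read_csv).
import pandas as pd

df = pd.DataFrame({
    "gender":   ["F", "M", "M", "F", "M", "M", "M", "F", "M", "M"],
    "age_band": ["18-30", "31-50", "31-50", "51+", "18-30",
                 "31-50", "51+", "31-50", "31-50", "18-30"],
    "approved": [0, 1, 1, 0, 1, 1, 1, 0, 1, 0],
})

# 1. Representation: does the data cover each group in realistic proportions?
print(df["gender"].value_counts(normalize=True))
print(df["age_band"].value_counts(normalize=True))

# 2. Outcome gaps: do historical decisions already favour one group?
print(df.groupby("gender")["approved"].mean())

# Any large skew found here will be learned, and repeated, by a model
# trained on this data unless it is addressed first.
```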
In short, stopping biased AI at the root will require contributions from everyone. Ensuring fair and unbiased AI programs will take effort, but it is something we all need to strive for.
What next?
If you’re interested in hearing more, I spoke on “Biased Data, Biased AI” at the sold-out AI Con 2019, covering a selection of the biases around us and discussing in detail the ways we can work to prevent them. Recordings will be released on the website soon, so watch this space!