Q&A: Industry-first research on AI gender bias (Includes interview) – Digital Journal


Cogito researchers have developed state-of-the-art speech emotion recognition models, trained on a well-known and comprehensive speech emotion dataset, and validated two types of machine learning de-biasing techniques. Notably, one of the training procedures can be scaled and applied to other scenarios involving protected variables. Cogito recently presented its findings in a whitepaper delivered at the Interspeech Conference. While some previous research has investigated emotion and gender bias separately, the Cogito research is the first to investigate gender de-biasing directly in the context of speech emotion recognition. To understand more, Digital Journal spoke with Dr. John Kane, Distinguished Scientist, Machine Learning, Cogito Corp.

Digital Journal: How sophisticated is AI becoming?

Dr. John Kane: Artificial intelligence (AI) continues to evolve and impact our personal lives and virtually every industry. In recent years there have been countless advancements, including developments in automated cancer screening and the introduction of expressive and emotive modes in virtual assistants. It's an exciting time to be part of the AI community! Such developments in commercial AI systems are largely enabled by ever-improving machine learning (ML) models, which use complex neural network architectures to exploit the value contained within very large volumes of data. At the same time, the machine learning discipline faces many challenges. The media frenzy in this area has elevated expectations among business leaders and the general public regarding ML models' accuracy. Modern models are often portrayed in the press as fully automated; in reality, most commercial systems depend on extremely large volumes of human labeling and data correction, along with teams of skilled scientists and engineers.
Additionally, with AI systems becoming increasingly ubiquitous, there is high demand for ML models to be explainable or interpretable – for instance, the need to answer questions like, "Why did the algorithm decline my loan application?" Another key challenge is the issue of bias. With ML technology being used in a wide variety of applications, researchers and the general public are concerned that these algorithms be not just accurate in the overall sense, but also fair towards different sections of the population.

DJ: Why do many types of AI have inherent biases?

Kane: Machine learning models, and consequently AI systems, inherit bias largely from the data used to train and develop them. If training data for a machine learning model contains some form of bias, it is extremely likely that the resulting models will perpetuate it. Sampling bias is one problem that arises here. If certain population categories, like elderly speakers, are missing from the training data, the resulting models will likely perform poorly for that category compared to groups that were well represented. Problems can also occur if the training data has stereotypes embedded within it. This was highlighted previously when sentences generated by natural language generation (NLG) models trained on a Google News database clearly demonstrated gender bias, reflecting an archaic view of female occupations. As commercial ML-powered systems are developed using labeled training data, often provided by humans, any bias – conscious or unconscious – will be encoded into the models trained with it. Although other factors can introduce gender bias into ML models and AI systems, the nature of the training data is the most significant one.

DJ: How does gender bias arise and what form does it take?

Kane: Bias generally occurs when a group in the population is treated unfairly.
Unfairness can take different forms – it can involve perpetuating negative stereotypes, as in the natural language generation example highlighted earlier. It can also take the form of lower model accuracy for one group compared to the rest of the population. Gender bias, specifically, has been present in speech technology for several decades: computer voices (speech synthesis), automatic speech recognition and speaker verification algorithms all performed poorly for female voices compared to male ones. While recent neural network developments in speech technology have reduced this bias considerably, there are still areas where gender bias persists – for instance, emotion recognition from speech.

DJ: What has Cogito's research shown?

Kane: Our recent publication documented the first research to explicitly examine gender bias in speech emotion recognition. The research first involved an analysis of modern speech emotion recognition models trained on, and applied to, a large, naturalistic and well-known dataset of speech from podcasts and radio shows. The models were trained to estimate human perception of "emotional activation" – the level of intensity of emotion. Our evaluation indicated that machine learning models make more mistakes recognizing low "emotional activation" in female speech samples than in male ones. For instance, these models would do a better job of correctly identifying male speech as calm, whereas they would more often confuse calm female speech with excited or intense speech. The beauty of current neural network-based machine learning is that if you can create a mathematical definition of a phenomenon, you can train a model to optimize for it.
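To make that idea concrete, here is a minimal sketch (my own illustration, not Cogito's published formulation) of one way such a mathematical definition of gender bias could be written down: the gap between the average prediction error for female and male speech samples.

```python
def group_error(preds, targets):
    """Mean absolute error over one group's (prediction, target) pairs."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def gender_bias(preds_f, targets_f, preds_m, targets_m):
    """One candidate bias definition: the absolute gap between the
    per-group errors. Zero means both groups are served equally well."""
    return abs(group_error(preds_f, targets_f)
               - group_error(preds_m, targets_m))

# Toy "emotional activation" scores in [0, 1]: the model over-predicts
# intensity for the female samples, so the bias term is non-zero.
bias = gender_bias([0.8, 0.7], [0.5, 0.5],   # female preds vs. labels
                   [0.5, 0.6], [0.5, 0.5])   # male preds vs. labels
print(round(bias, 3))  # 0.2
```

Once a quantity like this is defined, it can be measured, audited, or, as described next, folded into the training objective itself.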
Oftentimes, ML practitioners look to minimize some error, like recognizing an emotion incorrectly, but you can add other mathematical definitions of "bad" to be minimized, such as a definition of gender bias or unfairness. The second part of our research did exactly this: it introduced a mathematical definition of gender bias and then demonstrated that an ML model could be trained to minimize both the emotion recognition error and the gender bias definition jointly. The results indicated a clear reduction in gender bias compared to contemporary models, while maintaining a state-of-the-art level of accuracy.

DJ: What measures can be taken to reduce bias?

Kane: As we shift our attention to place higher importance on creating AI and ML models that are fair for everyone, having open and honest conversations about the causes of and solutions to bias is a great first step in reducing it – especially knowing that it will require an industry-wide team effort.

DJ: Are there any best practice examples?

Kane: There are four best practices we can all follow as we strive to remove bias from our AI technology. First, actively ensure diversity is incorporated in the training samples. For example, pull an equal number of female and male audio samples for your training data – diversity of thought, upbringing, education and experiences all play a role when combating bias in AI. Second, ensure the humans who label the audio samples also come from diverse backgrounds. This is another best practice we can act on when collecting more training data. When paired with modern machine learning techniques, diversity in research teams enables us to better identify bias and improve the data used to train ML models. Third, leaders should actively engage with their teams and scientists to monitor and identify bias as it arises, so it can be addressed as soon as possible.
The key is to develop definitions of fairness, and quantitative metrics around those definitions that can be monitored and audited. Having conversations across your organization will be paramount for reducing AI bias in 2020 and beyond. Fourth, apply de-biasing machine learning techniques where necessary. If you observe unfairness in your models for some groups in the population, and adding more training data from that group is either not possible or does not help, consider applying de-biasing techniques, where you optimize your model not just for the primary accuracy objective but also to reduce bias towards the sensitive group.
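The joint objective described in the interview can be sketched as follows. This is a toy illustration under my own assumptions – the weighting knob `lam` and the specific error and bias terms are hypothetical, not Cogito's published method – but it shows the general shape: the quantity being minimized is the task error plus a weighted fairness penalty.

```python
def mse(preds, targets):
    """Mean squared error: stands in for the primary emotion objective."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def joint_loss(preds, targets, is_female, lam=1.0):
    """Task error plus a weighted gender-fairness penalty. `lam` is a
    hypothetical knob trading overall accuracy against fairness."""
    f = [(p, t) for p, t, g in zip(preds, targets, is_female) if g]
    m = [(p, t) for p, t, g in zip(preds, targets, is_female) if not g]
    err_f = mse([p for p, _ in f], [t for _, t in f])
    err_m = mse([p for p, _ in m], [t for _, t in m])
    # Minimizing this drives down overall error AND the group error gap.
    return mse(preds, targets) + lam * abs(err_f - err_m)

# Toy batch: the model over-predicts activation for the female samples,
# so the fairness penalty adds to the plain task error.
loss = joint_loss([0.8, 0.7, 0.5, 0.6], [0.5] * 4,
                  [True, True, False, False], lam=1.0)
print(round(loss, 4))  # ≈ 0.095
```

In a real training loop this scalar would be the quantity backpropagated through the network, so gradient descent reduces the error and the bias gap together rather than the error alone.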
