The Promise and Peril of AI in Women’s Health



By Lucy Erickson, PhD, SWHR Director of Science Programs

Depending on who you ask, artificial intelligence (AI) is either a miracle worker that will solve the world’s problems or the new snake oil. Of course, the reality lies somewhere in between and is substantially more complex.

Much of the current excitement in AI surrounds an advanced technique called deep learning. Deep learning algorithms are a powerful evolution of machine learning, which essentially allows a computing system to learn by making predictions about the data it is fed.
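
For readers who want to see what "learning by making predictions" looks like in practice, here is a minimal, hypothetical sketch using synthetic data and scikit-learn's small neural network (it is an illustration only, not any of the systems described in this article): the model repeatedly compares its predictions to the correct answers and adjusts itself to shrink the gap.

```python
# Minimal, illustrative sketch of "learning by predicting" with a small
# neural network; synthetic data only, not a medical model.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic dataset: 1,000 examples, 20 numeric features, 2 classes.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small multilayer ("deep") network that adjusts its weights whenever
# its predictions on the training data are wrong.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
```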

Deep learning techniques have great potential to transform health care and aid in the diagnosis and treatment of disease. For example, Mayo Clinic recently found that applying AI to an electrocardiogram — a standard test that measures the electrical activity of the heartbeat — could reliably detect asymptomatic left ventricular dysfunction, a precursor to heart failure.

In women’s health specifically, researchers at MIT created a deep learning algorithm that reads mammograms to assess breast density, and Google teamed up with universities and medical centers to design an algorithm that appears to read mammograms better than radiologists, reducing both false negatives and false positives. Researchers at NIH and Global Good have also developed an algorithm that analyzes digital images of a woman’s cervix to identify precancerous changes.

But there is also a real possibility for AI, under the guise of objectivity, to perpetuate and increase longstanding biases in the medical field.

AI algorithms are prone to propagating any biases present in the data from which they learn. In other words, AI systems are only as good as the data on which they are trained. Thus, problems arise when applying AI to fields that are already susceptible to bias, such as health care, which has historically been biased against women, people of color, and other underrepresented groups.

Examples of bias in AI algorithms harming already vulnerable populations are commonplace. Amazon scrapped an experimental recruiting tool after it learned to downgrade applications from women, and even the Apple credit card recently came under fire for allegedly offering women lower credit limits than men.

Data scientist Cathy O’Neil has dubbed such mathematical models, which masquerade as valid measures of important traits while reinforcing and amplifying inequity, “Weapons of Math Destruction” because of their potential to cause societal harm.

Historically, health research has tended to study white men, and until 1993, women of childbearing age were actively excluded from most clinical trials. Moreover, the National Institutes of Health only began requiring NIH-funded researchers to consider sex as a biological variable in their preclinical studies in animals and cells in 2016. Although the medical field is moving in the right direction, there is a lot of work to be done before it can overcome its biases against women and people of color.

Because AI tools readily soak up the biases already present in the human-generated data they are fed, there is good reason to be wary of AI solutions in the health space. Consider the recent report that a medical algorithm used in health care programs learned to favor white patients over black patients. The algorithm did not explicitly use race to make decisions, which would have been illegal. Instead, bias against black patients emerged, unintentionally and organically, from patterns in the data used to train the algorithm. The result was that the algorithm allocated fewer resources to sick black patients than to equally sick white patients.
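
One widely reported mechanism in that case was the choice of training target: the algorithm was built to predict future health care costs as a stand-in for health needs, and because less money has historically been spent on black patients with the same level of need, lower predicted costs translated into lower priority. The toy simulation below, with purely hypothetical numbers rather than the actual commercial algorithm, shows how that kind of proxy bias can emerge even when group membership is never used as an input.

```python
# Toy simulation of proxy bias: the model is trained to predict health care
# COST rather than health NEED. Because group B's costs are lower at the
# same level of need (e.g., unequal access to care), equally sick patients
# in group B are flagged for extra resources less often, even though the
# model never sees group membership directly. Hypothetical data only.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 20_000

# True health need is identically distributed in both groups.
need = rng.normal(size=n)
group_b = rng.integers(0, 2, size=n).astype(bool)

# Historical and current spending track need but are systematically lower
# for group B at the same level of need.
cost_gap = 0.8
prior_cost = need - cost_gap * group_b + rng.normal(scale=0.3, size=n)
this_year_cost = need - cost_gap * group_b + rng.normal(scale=0.3, size=n)

# Features: prior spending plus a noisy clinical signal; no group label.
features = np.column_stack([prior_cost, need + rng.normal(scale=0.5, size=n)])

# Train to predict cost, then flag the top 20% of scores for extra care.
model = LinearRegression().fit(features, this_year_cost)
scores = model.predict(features)
flagged = scores >= np.quantile(scores, 0.8)

# Among patients who are genuinely sick, group B is flagged far less often.
sick = need > 1.0
print("sick group A flagged:", flagged[sick & ~group_b].mean().round(3))
print("sick group B flagged:", flagged[sick & group_b].mean().round(3))
```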

Given the bias against women in the medical field and the propensity of AI to learn biases from its training data, AI health applications could be similarly, unintentionally biased against women. Moreover, bias may be especially harmful for individuals with membership in multiple vulnerable groups (e.g., black women).

Algorithmic harms do not usually come from a malevolent actor, but rather occur when a well-meaning actor fails to catch a harmful byproduct of the algorithm they created. In addition to issues with biases in data, the people who create deep learning algorithms also have implicit biases, which may inadvertently be reflected in their programs. The lack of diversity in the field of AI likely contributes to this problem, with one estimate suggesting just 12% of machine learning researchers are women.

Addressing AI bias is challenging because some algorithms are proprietary and have been described as “black box” models: even the creator of an algorithm may not be able to fully explain its outputs or how it arrived at a specific conclusion. Moreover, researchers who study algorithmic fairness have noted that, in some situations, building every desired type of fairness into a model is mathematically impossible.
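
One well-known version of that impossibility result, often attributed to work by Chouldechova and by Kleinberg and colleagues, is that when two groups differ in how common the condition being predicted is, a classifier cannot simultaneously equalize its positive predictive value, false negative rate, and false positive rate across the groups. The short calculation below uses purely hypothetical numbers to illustrate the tension.

```python
# Small numerical illustration of a known tension between fairness criteria:
# if two groups have different base rates of the condition being predicted,
# equalizing positive predictive value (PPV) and false negative rate (FNR)
# across groups forces their false positive rates (FPR) to differ.

def fpr(prevalence, ppv, fnr):
    """False positive rate implied by a group's prevalence, PPV, and FNR."""
    return (prevalence / (1 - prevalence)) * ((1 - ppv) / ppv) * (1 - fnr)

# Hypothetical groups with different base rates of disease.
prevalence_a, prevalence_b = 0.30, 0.10

# Suppose we force equal PPV (0.8) and equal FNR (0.2) for both groups...
ppv, fnr = 0.8, 0.2

# ...then the false positive rates are forced apart:
print("FPR group A:", round(fpr(prevalence_a, ppv, fnr), 3))  # ~0.086
print("FPR group B:", round(fpr(prevalence_b, ppv, fnr), 3))  # ~0.022
```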

Ultimately, AI is a tool that businesses use to solve problems faster and with less friction. But mitigating bias requires considering who may be affected by a decision, intended or not, and weighing ethics at every stage of data analysis and automated decision-making. Members of the technology community have been calling for greater awareness of ethical issues, drafting ethical codes for data science, and researching ways to identify and mitigate bias in AI.

Despite the challenges, AI represents an opportunity to leverage the enormous amounts of data stored in electronic health records to improve care and disease outcomes. The field can take advantage of ongoing initiatives to collect data on diverse populations, such as the National Institutes of Health’s All of Us Research Program, which is building one of the most diverse health databases in history, or Verily’s Project Baseline, which is collecting health data from populations that typically do not participate in clinical trials. We are optimistic about the field’s ability to overcome the perils of AI and harness its promise for women’s health.
