Anuj Shrestha

To pediatrician Nader Shaikh, the rhythm of treating babies running high fevers is familiar. After ruling out the obvious colds and other common viruses, he must often thread a catheter into a months-old baby to draw a urine sample and check for a urinary tract infection (UTI). “You have to hold the baby down, the baby’s crying, the mother is usually crying too,” says Shaikh, who works at the University of Pittsburgh. “It’s traumatic.”

UTIs, although relatively rare in children under age 2, carry a high risk of kidney damage in this group if left untreated. Often, the only symptom is a high fever. But high fevers can also signal a brain or blood infection, or a dozen other illnesses that can be diagnosed without a urine sample. To help clinicians avoid the unnecessary pain and expense of catheterizing a shrieking infant, Shaikh and his colleagues developed an equation that gauges a child’s risk of a UTI based on age, fever, circumcision status, gender, and other factors—including whether the child is Black or white. Race is part of the equation because previous studies found that—for reasons that aren’t clear—UTIs are far less common in Black children than in white ones.

The UTI algorithm is only one of several risk calculators that factor in race, which doctors routinely use to make decisions about patients’ care. Some help them decide what tests to perform next or which patients to refer to a specialist. Others help gauge a patient’s lung health, their ability to donate a liver or kidney, or which diabetes medicines they need.

In the past few years, however, U.S. doctors and students reckoning with racism in medicine have questioned the use of algorithms that include race as a variable. Their efforts gained momentum thanks to the Black Lives Matter movement. In August 2020, a commentary published in The New England Journal of Medicine (NEJM) highlighted the use of race in calculators as a problem “hidden in plain sight.” It’s widely agreed that race is a classification system designed by humans that lacks a genetic basis, says Darshali Vyas, a medical resident at Massachusetts General Hospital and co-author on the paper. “There’s a tension between that [understanding] and how we see race being used … as an input variable in these equations,” Vyas says. “Many times, there’s an assumption that race is relevant in a biological sense.”

Vyas and others warn that using race to adjust risk calculators may also widen existing health disparities. Black Americans are generally diagnosed with kidney disease later than white Americans, which delays treatment and puts them at greater risk of developing kidney failure—yet an equation widely used to measure kidney function tends to estimate better function for Black patients relative to non-Black patients. Osteoporosis is underdiagnosed and undertreated in Black women, but a common bone fracture risk calculator places them, along with Asian and Hispanic women, at lower risk than white women. “We know these disparities exist, yet the calculators tell us that we don’t need to worry about this population,” says epidemiologist Anjum Hajat of the University of Washington, Seattle.

Some of these calculations are rooted in racist assumptions. Others emerged out of an effort to improve predictions across racial groups. The challenge of defining “normal” versus “diseased” and capturing these qualities accurately in a simple test led scientists to grasp whatever data they could to make their tools more accurate. And at a population scale, race often does correlate with medical outcomes, in part because it acts as a proxy for the influence of socioeconomic factors on health.

Deeper than skin color

Doctors use risk calculators to help decide a person’s prescriptions, their risks during surgery or childbirth, or when to refer them to a specialist. Race corrections sway this math in varied ways.


But even if racial trends sharpen predictions, using them to make decisions about an individual’s treatment is problematic, Hajat says. “Even if a calculator is not causing disparities, it is maintaining and perpetuating them,” she says. For some, applying a different standard to Black patients than to white ones recalls a long history of neglect and discrimination in medicine. “I don’t think people had bad intentions when they were creating these calculators,” Hajat says. “But we have to be aware that biomedical research has really contributed to upholding white supremacy, which is why we’re reexamining the calculators now.”

The questions are already spurring change. In March, a task force from the American Society of Nephrology and the National Kidney Foundation recommended removing race as a variable in the kidney function calculator, known as the estimated glomerular filtration rate (eGFR) equation. The University of Washington, Beth Israel Deaconess Medical Center, and others have already dropped race from their eGFR calculations.

But similar efforts met resistance at other institutions. To some researchers and clinicians, the use of calculators that incorporate race seems not just appropriate, but a crucial measure to avoid unnecessary medication or invasive treatments, such as a catheter in a 6-month-old baby. Shaikh sees the UTI equation’s use of race as an effort to achieve equity, not worsen disparities. “It sounds weird to use race to pick patients, and it doesn’t look good on the surface,” he says. “But which one is worse: catheterizing kids who don’t need it or using race in an algorithm? It’s more complicated than it seems.”

The history of racism in U.S. medicine dates back to the nation’s earliest medical schools. Benjamin Rush, one of the physicians who signed the Declaration of Independence, once described Blackness as a form of leprosy that could be cured to restore the “natural white flesh color.”

At least two modern-day risk calculators have been accused of having similarly racist logic: One, which estimates a woman’s odds of successful vaginal birth after cesarean section (VBAC), falsely assumes that women’s pelvis shapes differ based on race, making this form of childbirth riskier for Black and Hispanic women compared with white women. Another equation estimates lung function by gauging the maximum amount of air a person can exhale forcefully into an instrument called a spirometer. Lower measurements are considered normal for Black and Asian people, based on the disputed assumption that their lung capacity is lower. “The spirometer was built on anti-Black racism,” says Lundy Braun of Brown University, who studies the history of racial health disparities. The VBAC calculator was updated to remove race in May, but spirometers still include a race adjustment. The American Thoracic Society (ATS) has begun to examine its use, Braun says.

In other calculators, race has been added to bring measurements in line with the best available data. The eGFR equation, developed in 1999, estimates how well a person’s kidneys function based on blood levels of a compound called creatinine, which builds up when kidney filtration declines. Because the equation doesn’t test kidney function directly, its developers compared its results with kidney filtration rates measured using a more definitive test, based on a radioactive tracer, that is too complex to perform routinely. They found the eGFR equation consistently underestimated kidney function in Black patients, so they used a common statistical method called curve fitting to adjust the estimates according to race.
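To make the size of that adjustment concrete, here is a minimal sketch in Python. It assumes the widely published four-variable form of the MDRD Study equation (the version re-expressed for standardized creatinine assays); the article does not specify which version a given lab uses, and the function name, parameter names, and example patient are hypothetical. It is an illustration, not clinical software.

```python
"""Illustrative sketch of a race-adjusted kidney function estimate.

Assumption: four-variable MDRD Study equation, standardized-creatinine form.
The function name, parameters, and example patient are hypothetical.
"""

# Sustained eGFR below 60 mL/min/1.73 m^2 is a commonly used marker of
# chronic kidney disease (assumed threshold for this illustration).
CKD_THRESHOLD = 60.0


def mdrd_egfr(creatinine_mg_dl: float, age_years: float,
              female: bool, apply_race_coefficient: bool) -> float:
    """Return estimated GFR in mL/min/1.73 m^2."""
    if creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = 175.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if apply_race_coefficient:
        # The race correction is a flat multiplier: the same blood test
        # yields an estimate about 21% higher when it is applied.
        egfr *= 1.212
    return egfr


# Hypothetical patient: 60-year-old man, serum creatinine 1.4 mg/dL.
without_race = mdrd_egfr(1.4, 60, female=False, apply_race_coefficient=False)
with_race = mdrd_egfr(1.4, 60, female=False, apply_race_coefficient=True)

print(f"eGFR without race coefficient: {without_race:.1f}")
print(f"eGFR with race coefficient:    {with_race:.1f}")
print("Below threshold without coefficient:", without_race < CKD_THRESHOLD)
print("Below threshold with coefficient:   ", with_race < CKD_THRESHOLD)
```

For this hypothetical patient, the same blood test gives an estimate of roughly 52 without the coefficient and roughly 63 with it, which puts the two results on opposite sides of the assumed threshold of 60.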

Fractured risks

A bone fracture risk assessment tool includes individual patient details and history. But it also accounts for race, placing Black, Asian, and Hispanic people at lower risk of osteoporosis than white people, based on the lower incidence of the condition in these groups.

[Graphic: Delayed doses. Share of Black women and white women who received osteoporosis therapy (%), by calculated hip fracture risk (3–5%, 5–10%, ≥10%). Lifestyle changes or treatment are recommended when fracture risk exceeds 3%. But in a recent study, Black women were consistently less likely to receive prescription osteoporosis therapies, even when they were at high risk.]

[Graphic: Fracture risk assessment tool inputs: age, sex, weight, height, femoral neck bone mineral density, country (U.S. Caucasian, Black, Asian, or Hispanic), and yes/no risk factors (previous fracture, fractured hip in parent, smoking, glucocorticoids, rheumatoid arthritis, secondary osteoporosis, alcohol at 3 or more units/day). Probability of fracture (%), hip / all major fractures: Caucasian 1 / 8.3; Black 0.4 / 3.7; Asian 0.5 / 4.5; Hispanic 0.5 / 4.6.]

(Graphic) K. Franklin/Science; (Data) J. Curtis et al., J Gen Intern Med., 24(8): 956 (2009)

Other risk calculators have added race in an effort to better match epidemiological data. In 1992, the World Health Organization recognized an epidemic of osteoporosis and funded research to develop a tool that could assess a person’s risk for fractures based on the brittleness of their bones. Researchers developed several country-specific versions of the tool for the United States, Canada, South Africa, and others, which incorporated race-specific prevalence where data were available.

When adapting the equation to U.S. populations, the researchers included a race correction to account for the lower reported occurrence of osteoporosis in Black women. The goal was to avoid medicating people who didn’t need it, and the correction brought fracture predictions in line with official rates of disease.

And in this case, the differences may have a physiological underpinning, says epidemiologist Nicole Wright of the University of Alabama, Birmingham, who studies disparities in bone health. “Genetically, people of African descent have higher bone mass than others, so you need to account for that,” she says. “If you don’t have osteoporosis and you’re taking these medications, they do come with some risks.”

It’s unclear, however, whether the lower incidence of osteoporosis in Black women is also influenced by missed diagnoses due to lack of access to care, delayed screening, or the exclusion of these women from research studies of the condition.

Despite such uncertainties, it has been hard to resist including race in the calculators. “When you plug variables into a model and see a large effect, it seems like race is representing something and it should be in the calculator,” Hajat says. It’s convenient to see race as a variable like age, which is “predictive of everything related to health,” she says. “The problem is that race and age are fundamentally different things.”

If race doesn’t represent a biological difference between patients, why would including it in risk equations improve predictions at all? Even researchers who develop such equations often don’t know exactly why race matters. “We sometimes use surrogate measures that can identify people at different levels of risk, even if we don’t understand the exact factors driving their risk,” says epidemiologist Montserrat García-Closas of the National Cancer Institute, who has worked on various cancer risk calculators. “It doesn’t really matter if these are true [disease-causing] factors, which are much harder to establish.”

Also hard to establish are the public health consequences of including race in a given medical risk tool. The eGFR is especially controversial because “you can consider the trade-offs of accuracy versus the harms of the equation,” says medical student James Diao of Harvard Medical School, who is studying alternative, race-free equations. Adding race to the eGFR equation may have made it technically more accurate for Black patients, but it also results in fewer of them being diagnosed with chronic kidney disease.

Deleting race from the tool could counteract long-standing disparities in care, because more Black patients would receive earlier referrals to specialists and get placed on transplant lists sooner. (On the other hand, it might also mean fewer receive certain lifesaving medications for blood pressure, diabetes, and other conditions because of a risk of renal side effects.) Similarly, an analysis presented at the ATS International Conference in May suggests nearly 21% more Black patients would be diagnosed with more severe pulmonary disease, and receive earlier care, if race were removed from the lung function calculator.

But removing race from medical equations is not a matter of simple math. The eGFR equation for kidney function is embedded in electronic medical systems used in hospitals and commercial laboratories. A technician using a spirometer to test lung function must begin by entering a patient’s race along with age, height, and other details. And the fracture risk calculator is built into scanners that test bone density.

Aside from the technical challenges of updating instruments, testing labs are obliged to follow current regulations and standards of care endorsed by professional societies and used in clinics. Without a formal change in guidelines, individual test providers or clinics may find it tough to discard risk models that include race.

Legal constraints might also make it difficult for physicians to phase out race-based calculations, such as those used to gauge the risks of a surgery. “From a liability standpoint, surgeons might be compelled to use what’s validated as the most accurate equation,” Diao says. “Otherwise, it might be seen as a failure to adequately inform a patient about a procedure’s risks prior to consent.”

When Vyas and her colleagues published a list of problematic algorithms in their NEJM commentary, it raised concerns—and ire—across clinical specialties. In September 2020, osteoporosis researcher John Kanis of the University of Sheffield, who developed the bone fracture risk tool, published a commentary in the journal Osteoporosis International arguing that using country-specific incidence trends, including by race, is important to the tool’s accuracy and avoids overdiagnosis. He and his co-authors note, for example, that Black people in the United States have lower fracture risk than white Americans, but their risk is far higher than that of Black people in African countries.

Although the underdiagnosis of osteoporosis among Black people in the United States is a problem, “I don’t think removing race from the calculator would do anything to reduce disparities,” says osteoporosis researcher Michael Lewiecki of the University of New Mexico Health Sciences Center. “These are important problems, but the calculator is not their cause.”

At a faculty meeting last summer, Shaikh says his colleagues discussed whether to respond to the NEJM commentary to refute the suggestion that the use of race in algorithms is always problematic. But Shaikh didn’t see the commentary as a call to change the UTI risk calculator. “To me, it reflected a desire to bring the issue [of race] to the surface, which is laudable,” he says. “I don’t know who is right or wrong here. My view is, these tools have to be based on data.”

Medical student James Diao of Harvard Medical School is one of many physician-researchers studying alternatives to the use of race in risk calculators.

Danielle Duffey/BIDMC

It’s tough to know whether the UTI calculator leads to disparities, and there’s little evidence to suggest UTIs are underdiagnosed or undertreated in Black children. But social justice movements in recent years prompted Shaikh to reexamine the data he used to create the UTI calculator, including the information in previous studies that suggested the occurrence of infections varied by race. “It’s true what the criticisms say that using race is not free of problems,” he says. “If we test people based on race, we are creating a difference. The question is if that difference is to the patient’s benefit—I don’t think the difference itself bothers me as much as the idea that we might cause harm.”

Going forward, any large study that relies on racial differences to develop models of disease risk should undergo additional review before publication to consider potential consequences of applying the research to medicine, says nephrologist Nwamaka Eneanya of the University of Pennsylvania. Researchers also need to go beyond correlating race with health outcomes to pinpoint the actual drivers of health disparities, such as income, education, or neighborhood environmental exposures, she adds. “That’s not a standard that is expected of scientists in this day and age, and it needs to be,” Eneanya says. “This is a wake-up call for the scientific community.”

Replacing race with a different metric is not always easy. Recent studies have attempted to use ZIP codes, income or education levels, or a measure of socioeconomic status called the area deprivation index instead of race to capture conditions that influence health. Precisely how they’d be implemented isn’t clear, and few have been put to work in clinics or endorsed by professional societies of clinicians.

But once researchers and clinicians commit to equity, they often find good alternatives to the use of race, Diao says. In addition to recommending the removal of race from the eGFR equation, nephrology researchers are evaluating a handful of different race-free equations that combine creatinine with other biomarkers such as the protein cystatin C. The tests seem to perform just as well, though they will need further validation and can be more expensive.

Researchers who developed the childbirth risk calculator, which had assumed that pelvis shapes vary by race, published a new version in May that differs from the original only in the removal of race, and it performs just as well. “Presenting this as a choice of using a more accurate equation with race or a less accurate one without it is a false dilemma,” Diao says. “When there’s enough pressure to create an equation that is both accurate and does not use race, researchers rise to the challenge.”

Last year, Shaikh began to work with community organizers to gather parents’ perspectives on the infant UTI calculator. In virtual meetings with these participants, he reenacted the familiar emergency room scenario: A research assistant played the worried parent with a feverish child, while Shaikh played the doctor and explained the use of the calculator.

One participant, a Black father, understood the need to keep race in the equation and drew a parallel to affirmative action. Still, he wondered, was there an alternative? “He didn’t have a problem with it as long as it improved outcomes,” Shaikh recalls. “But given the history of racism in the U.S., it’s a lot to expect people to just trust that it’s going to improve outcomes.” Shaikh eventually homed in on two replacement variables for race in the calculator: the duration of a child’s fever and a prior history of UTIs. He validated the tool and updated the online calculator to a race-free version this week.

Replacing race doesn’t imply researchers shouldn’t continue to seek causes for disparities, Shaikh warns: A child’s history of UTIs is useful in the calculator in part because it captures differences between Black and white children. “We didn’t solve the problem: The data still show a link between race and UTI,” he says. “It’s important to understand that, not bury it.”

Reporting for this story was supported by a project fellowship from the Massachusetts Institute of Technology Knight Science Journalism program.