Note: The Overview section summarizes the published evidence on this topic. The rest of the summary describes the evidence in more detail.
Other PDQ summaries with information related to breast cancer screening include the following:
Mammography is the most widely used screening modality for the detection of breast cancer. There is evidence that it decreases breast cancer mortality in women aged 50 to 69 years and that it is associated with harms, including the detection of clinically insignificant cancers that pose no threat to life (overdiagnosis). The benefit of mammography for women aged 40 to 49 years is uncertain.[1,2] Randomized trials in India, Iran, and Egypt have studied the use of clinical breast examination (CBE) as a screening test. Some of these studies suggested a shift in late-stage disease; however, there is still insufficient evidence to conclude a mortality benefit.[3-8] Breast self-examination has been shown to have no mortality benefit.
Technologies such as ultrasound, magnetic resonance imaging, and molecular breast imaging are being evaluated, usually as adjuncts to mammography. They are not primary screening tools in the average population.
Informed medical decision making is increasingly recommended for individuals who are considering cancer screening. Many different types and formats of decision aids have been studied. For more information, see Cancer Screening Overview.
Randomized controlled trials (RCTs) initiated 50 years ago provide evidence that screening mammography reduces breast cancer–specific mortality for women aged 60 to 69 years (solid evidence) and women aged 50 to 59 years (fair evidence). Population-based studies done more recently raise questions about the benefits for populations who participate in screening for longer time periods.
Magnitude of Effect: Based on a meta-analysis of RCTs, the number of women needed to invite for screening to prevent one breast cancer death depends on the woman’s age: for women aged 39 to 49 years, 1,904 women needed (95% confidence interval [CI], 929–6,378); for women aged 50 to 59 years, 1,339 women needed (95% CI, 322–7,455); and for women aged 60 to 69 years, 377 women needed (95% CI, 230–1,050).[9]
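The number needed to invite is the reciprocal of the absolute reduction in breast cancer deaths among invited women, so the point estimates above can be re-expressed as deaths prevented per 10,000 women invited. The Python sketch below does only that arithmetic. The function name and the per-10,000 scale are illustrative choices, not part of the cited meta-analysis, and the confidence intervals are not propagated.

```python
# Point estimates of the number needed to invite (NNI), taken from the text above.
NNI_BY_AGE = {"39-49": 1904, "50-59": 1339, "60-69": 377}


def deaths_prevented_per_10000(nni: float) -> float:
    """Convert an NNI into deaths prevented per 10,000 women invited.

    NNI = 1 / absolute risk reduction, so the reduction per 10,000 is 10,000 / NNI.
    """
    if nni <= 0:
        raise ValueError("NNI must be positive")
    return 10_000 / nni


for age_group, nni in NNI_BY_AGE.items():
    print(f"{age_group} years: {deaths_prevented_per_10000(nni):.1f} per 10,000 invited")
```

On these point estimates, the absolute benefit for women aged 60 to 69 years is about five times that for women aged 39 to 49 years, although the wide confidence intervals mean the true ratio is uncertain.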
The validity of meta-analyses of RCTs demonstrating a mortality benefit is limited by improvements in medical imaging and treatment in the decades since their completion. The 25-year follow-up from the Canadian National Breast Screening Study (CNBSS),[10] completed in 2014, showed no mortality benefit associated with screening mammograms.
Based on solid evidence, screening mammography may lead to the following harms:
For all of these conclusions regarding potential harms from screening mammography, internal validity, consistency, and external validity are good.
The CNBSS trial did not study the efficacy of CBE versus no screening. Ongoing randomized trials, two in India and one in Egypt, are designed to assess the efficacy of screening CBE but have not reported mortality data.[3-8] Thus, the efficacy of screening CBE cannot be assessed yet.
Screening by CBE may lead to the following harms:
Breast self-examination (BSE) has been compared with no screening and has been shown to have no benefit in reducing breast cancer mortality.
There is solid evidence that formal instruction and encouragement to perform BSE leads to more breast biopsies and more diagnoses of benign breast lesions.
Breast cancer is the most common noncutaneous cancer in U.S. women, with an estimated 310,720 cases of invasive disease, 56,500 cases of in situ disease, and 42,250 deaths expected in 2024.[1] Women with inherited risk, including BRCA1 and BRCA2 gene carriers, make up approximately 5% to 10% of breast cancer cases.[2] Men account for about 1% of breast cancer cases and breast cancer deaths.[1]
The biggest risk factor for breast cancer is being female, followed by advancing age. Other risk factors include hormonal aspects (such as early menarche, late menopause, nulliparity, late first pregnancy, and postmenopausal hormone therapy use), alcohol consumption, and exposure to ionizing radiation.
Breast cancer incidence in White women is higher than in Black women, who also have a lower survival rate for every stage when diagnosed.[3] This may reflect differences in screening behavior and access to health care. Hispanic women, Asian or Pacific Islander women, and American Indian or Alaska Native women have lower incidence and mortality rates than White or Black women.[4]
Breast cancer incidence depends on reproductive issues (such as early vs. late pregnancy, multiparity, and breastfeeding), participation in screening, and postmenopausal hormone usage. The incidence of breast cancer (especially ductal carcinoma in situ [DCIS]) increased dramatically after mammography was widely adopted in the United States and the United Kingdom.[5] Widespread use of postmenopausal hormone therapy was associated with a dramatic increase in breast cancer incidence, a trend that reversed when its use decreased.[6]
In any population, the adoption of screening is not followed by a decline in the incidence of advanced-stage cancer.
Women with breast symptoms undergo diagnostic mammography as opposed to screening mammography, which is done in asymptomatic women. In a 10-year study of breast symptoms prompting medical attention, a breast mass led to a cancer diagnosis in 10.7% of cases, whereas pain was associated with cancer in only 1.8% of cases.[7]
Breast cancer can be diagnosed when breast tissue cells removed during a biopsy are studied microscopically. The breast tissue to be sampled can be identified by an abnormality on an imaging study or because it is palpable. Breast biopsies can be performed with a thin needle attached to a syringe (fine-needle aspirate), a larger needle (core biopsy), or by excision (excisional biopsy). Image guidance can improve accuracy. Needle biopsies sample an abnormal area large enough to make a diagnosis. Excisional biopsies aim to remove the entire region of abnormality.
DCIS is a noninvasive condition that can be associated with, or evolve into, invasive cancer, with variable frequency and time course.[8] Some authors include DCIS with invasive breast cancer statistics, but others argue that it would be better if the term were replaced with ductal intraepithelial neoplasia, similar to the terminology used for cervical and prostate precursor lesions, and that excluding DCIS from breast cancer statistics should be considered.
DCIS is most often diagnosed by mammography. In the United States, only 4,900 women were diagnosed with DCIS in 1983 before the adoption of mammography screening, compared with approximately 56,500 women who are expected to be diagnosed in 2024.[1,8,9] The Canadian National Breast Screening Study-2, which evaluated women aged 50 to 59 years, found a fourfold increase in DCIS cases in women screened by clinical breast examination (CBE) plus mammography, compared with those screened by CBE alone, with no difference in breast cancer mortality.[10] For more information, see Breast Cancer Treatment.
The natural history of DCIS is poorly understood because nearly all DCIS cases are detected by screening and nearly all are treated. Development of breast cancer after treatment of DCIS depends on the pathological characteristics of the lesion and on the treatment. In a randomized trial, 13.4% of women whose DCIS was excised by lumpectomy developed ipsilateral invasive breast cancer within 90 months, compared with 3.9% of those treated by both lumpectomy and radiation.[11] Among women diagnosed and treated for DCIS, the percentage of women who died of breast cancer is lower than that for the age-matched population at large.[12,13] This favorable outcome may reflect the benign nature of the condition, the benefits of treatment, or the volunteer effect (i.e., women who undergo breast cancer screening are generally healthier than those who do not do so).
Atypia, which is a risk factor for breast cancer, is found in 4% to 10% of breast biopsies.[14,15] Atypia is a diagnostic classification with considerable variation among practicing pathologists.[16]
The range of pathologists' diagnoses of breast tissue includes benign without atypia, atypia, DCIS, and invasive breast cancer. The incidence of atypia and DCIS breast lesions has increased over the past three decades as a result of widespread mammography screening, although atypia is generally mammographically occult.[17,18] Misclassification of breast lesions may contribute to either overtreatment or undertreatment of lesions—with variability especially in the diagnoses of atypia and DCIS.[16,19-23]
The largest study on this topic, the B-Path study, involved 115 practicing U.S. pathologists who interpreted a single breast biopsy slide per case, and it compared their interpretations with an expert consensus-derived reference diagnosis.[16] While the overall agreement between the individual pathologists’ interpretations and the expert reference diagnoses was highest for invasive carcinoma, there were markedly lower levels of agreement for DCIS and atypia.[16] As the B-Path study included higher proportions of cases of atypia and DCIS than typically seen in clinical practice, the authors expanded their work by applying Bayes’ theorem to estimate how diagnostic variability affects accuracy from the perspective of a U.S. woman aged 50 to 59 years having a breast biopsy.[19] At the U.S. population level, it is estimated that 92.3% (95% confidence interval [CI], 91.4%–93.1%) of breast biopsy diagnoses would be verified by an expert reference consensus diagnosis, with 4.6% (95% CI, 3.9%–5.3%) of initial breast biopsies estimated to be overinterpreted and 3.2% (95% CI, 2.7%–3.6%) underinterpreted. Figure 1 shows the predicted outcomes per 100 breast biopsies, overall and by diagnostic category.
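The Bayes' theorem step can be illustrated with a small Python sketch. It combines the probability of each pathologist interpretation given the reference diagnosis with the prevalence of each reference diagnosis, and returns the probability of each reference diagnosis given the interpretation. All numbers below are hypothetical placeholders chosen for illustration; they are not the B-Path data or the published population estimates, and the function names are assumptions.

```python
CATEGORIES = ["benign", "atypia", "DCIS", "invasive"]

# HYPOTHETICAL prevalence of each reference diagnosis among biopsies (sums to 1).
PRIOR = {"benign": 0.70, "atypia": 0.05, "DCIS": 0.10, "invasive": 0.15}

# HYPOTHETICAL P(interpretation | reference); each row sums to 1.
LIKELIHOOD = {
    "benign":   {"benign": 0.90, "atypia": 0.08, "DCIS": 0.02, "invasive": 0.00},
    "atypia":   {"benign": 0.35, "atypia": 0.50, "DCIS": 0.15, "invasive": 0.00},
    "DCIS":     {"benign": 0.03, "atypia": 0.10, "DCIS": 0.84, "invasive": 0.03},
    "invasive": {"benign": 0.00, "atypia": 0.00, "DCIS": 0.04, "invasive": 0.96},
}


def posterior(interpretation: str) -> dict[str, float]:
    """Return P(reference | interpretation) for every reference category."""
    joint = {ref: PRIOR[ref] * LIKELIHOOD[ref][interpretation] for ref in CATEGORIES}
    evidence = sum(joint.values())  # P(interpretation), the Bayes denominator
    return {ref: joint[ref] / evidence for ref in CATEGORIES}


def overall_agreement() -> float:
    """Return P(interpretation == reference), weighted by prevalence."""
    return sum(PRIOR[c] * LIKELIHOOD[c][c] for c in CATEGORIES)


for ref, prob in posterior("atypia").items():
    print(f"P(reference={ref} | read as atypia) = {prob:.2f}")
print(f"Overall agreement = {overall_agreement():.3f}")
```

The design point is that per-category agreement measured in an enriched test set has to be reweighted by real-world prevalence before it describes what an individual woman's biopsy report means.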
To address the high rates of discordance in breast tissue diagnosis, laboratory policies that require second opinions are becoming more common. A national survey of 252 breast pathologists participating in the B-Path study found that 65% of respondents reported having a laboratory policy that requires second opinions for all cases initially diagnosed as invasive disease. Additionally, 56% of respondents reported policies that require second opinions for initial diagnoses of DCIS, while 36% of respondents reported mandatory second opinion policies for cases initially diagnosed as atypical ductal hyperplasia.[24] In this same survey, pathologists overwhelmingly agreed that second opinions improved diagnostic accuracy (96%).
A simulation study that used B-Path study data evaluated 12 strategies for obtaining second opinions to improve interpretation of breast histopathology.[25] Accuracy improved significantly with all second-opinion strategies, except for the strategy limiting second opinions only to cases of invasive cancer. Accuracy improved regardless of the pathologists’ confidence in their diagnosis or their level of experience. While the second opinions improved accuracy, they did not completely eliminate diagnostic variability, especially in the challenging case of breast atypia.
Women with an increased risk of breast cancer caused by a BRCA1 or BRCA2 genetic mutation might benefit from increased screening. For more information, see BRCA1 and BRCA2: Cancer Risks and Management.
Women with Hodgkin and non-Hodgkin lymphoma who were treated with mantle irradiation have an increased risk of breast cancer, starting 10 years after completing therapy and continuing life-long. Therefore, screening mammography has been advocated, even though it may begin at a relatively young age.[26,27]
The potential benefits of screening mammography occur well after the examination, often many years later, whereas the harms occur immediately. Therefore, women with limited life expectancy and comorbidities who suffer harms may do so without benefit. Nonetheless, many of these women undergo screening mammography.[28] In one study, approximately 9% of women with advanced cancer underwent cancer screening tests.[29]
Screening mammography may yield cancer diagnoses in approximately 1% of women aged 66 to 79 years, but most of these cancers are low risk.[30] The question remains whether the diagnosis and treatment of localized breast cancer in older women is beneficial.
There is no evidence of benefit in performing screening mammography in average-risk women younger than 40 years.
Approximately 1% of all breast cancers occur in men.[31] Most cases are diagnosed during the evaluation of palpable lesions, which are generally easy to detect. Treatment consists of surgery, radiation, and systemic adjuvant hormone therapy or chemotherapy. For more information, see Male Breast Cancer Treatment. In this population, screening is unlikely to be beneficial.
Mammography uses ionizing radiation to image breast tissue. The examination is performed by compressing the breast firmly between two plates, which spreads out overlapping tissues and reduces the amount of radiation needed for the image. For routine screening in the United States, examinations are taken in both mediolateral oblique and craniocaudal projections.[1] Both views will include breast tissue from the nipple to the pectoral muscle. Radiation exposure is 4 to 24 mSv per standard two-view screening examination. Two-view examinations have a lower recall rate than single-view examinations because they reduce concern about abnormalities caused by superimposition of normal breast structures.[2] Two-view exams have lower interval cancer rates than single-view exams.[3]
Under the Mammography Quality Standards Act (MQSA) enacted by Congress in 1992, all U.S. facilities that perform mammography must be certified by the U.S. Food and Drug Administration (FDA) to ensure the use of standardized training for personnel and a standardized mammography technique utilizing a low radiation dose.[4] (See the FDA's web page on Mammography Facility Surveys, Mammography Equipment Evaluations, and Medical Physicist Qualification Requirement under MQSA.) The 1998 MQSA Reauthorization Act requires that patients receive a written lay-language summary of mammography results.
The following Breast Imaging Reporting and Data System (BI-RADS) categories are used for reporting mammographic results:[5]
Most screening mammograms are interpreted as negative or benign (BI-RADS 1 or 2, respectively); about 10% of women in the United States are asked to return for additional evaluation.[6] The percentage of women asked to return for additional evaluation varies not only by the inherent characteristics of each woman but also by the mammography facility and radiologist.[7]
Tumor detection has not been validated as a proper surrogate outcome measure for breast cancer mortality, and novel screening methods that simply increase tumor detection rates may not necessarily reduce the risk of dying from breast cancer. Nonetheless, numerous studies have demonstrated improvements in breast tumor detection rates with modern imaging technology in the absence of mortality data. Between 1963 and 1990, screening mammography was assessed in nine randomized trials with breast cancer-specific mortality as the primary end point, and screening mammography recommendations were largely based on the results of these trials. However, in more recent years, novel breast screening technologies have often been assessed in clinical trials and observational studies with end points that have not been validated as proper surrogate outcome measures for breast cancer mortality.[8]
A systematic review of studies with a total of 488,099 patients compared digital breast tomosynthesis (DBT) alone, combined DBT and digital mammography (DM), and DM alone. DBT alone and combined DBT and DM were more sensitive than DM alone for breast cancer detection, but there appeared to be no significant difference in diagnostic accuracy between DBT alone and the combination of DBT and DM. A subsequent systematic review and meta-analysis by the same authors seemed to support the replacement of DM by synthetic 2-dimensional mammography (S2D) combined with DBT for breast cancer screening, as combining S2D and DBT improved tumor detection rates, and reduced recall rates, radiation dose, and overall costs.[8-10]
DM is more expensive than screen-film mammography (SFM) but is more amenable to data storage and sharing. Performance of both SFM and DM for cancer detection rate, sensitivity, specificity, and positive predictive value (PPV) has been compared directly in several trials, with similar results in most patient groups.
The Digital Mammographic Imaging Screening Trial (DMIST) compared the findings of digital and film mammograms in 42,760 women at 33 U.S. centers. Although DM detected more cancers in women younger than 50 years (area under the curve [AUC] of 0.84 +/- 0.03 for digital; AUC of 0.69 +/- 0.05 for film; P = .002), there was no difference in breast cancer detection overall.[11] A second DMIST report found a trend toward higher AUC for film mammography than for DM in women aged 65 years and older.[12]
Another large U.S. cohort study [13] also found slightly better sensitivity for digital mammography for women younger than 50 years with similar specificity.
A Dutch study compared the findings of 1.5 million digital versus 4.5 million screen-film screening mammograms performed between 2004 and 2010. A higher recall and cancer detection rate was observed for the digital screens.[14] A meta-analysis [15] of 10 studies, including the DMIST [11,12] and the U.S. cohort study,[13] compared DM and film mammography in 82,573 women who underwent both types of the exam. In a random-effects model, there was no statistically significant difference in cancer detection between the two types of mammography (AUC of 0.92 for film and AUC of 0.91 for digital). For women younger than 50 years, all studies found that sensitivity was higher for DM, but specificity was either the same or higher for film mammography.
Computer-aided detection (CAD) systems highlight suspicious regions, such as clustered microcalcifications and masses,[16] generally increasing sensitivity, decreasing specificity,[17] and increasing detection of ductal carcinoma in situ (DCIS).[18] Several CAD systems are in use. One large population-based study that compared recall rates and breast cancer detection rates before and after the introduction of CAD systems found no change in either rate.[16,19] Another large study noted an increase in recall rate and increased DCIS detection but no improvement in invasive cancer detection rate.[18,20] Another study, using a large database and DM in women aged 40 to 89 years, found that CAD did not improve sensitivity, specificity, or detection of interval cancers, but it did detect more DCIS.[21]
The use of new screening mammography modalities by more than 270,000 women aged 65 years and older in two time periods, 2001 to 2002 and 2008 to 2009, was examined, relying on a Surveillance, Epidemiology, and End Results (SEER)–Medicare-linked database. DM increased from 2% to 30%, CAD increased from 3% to 33%, and spending increased from $660 million to $962 million. CAD was used in 74% of screening mammograms paid for by Medicare in 2008, almost twice as many screening mammograms as in 2004. There was no difference in detection rates of early-stage (DCIS or stage I) or late-stage (stage IV) tumors.[22]
DBT is a mammographic technique that was approved by the FDA in February 2011.[23] Like conventional mammography, DBT compresses the breast and uses x-rays to create images. In DBT, an x-ray tube moves in an arc around the compressed breast, taking multiple images at different angles, which are then reconstructed or synthesized into a set of 3-dimensional images by a computer. Some cancers are better seen with this method than on conventional DM or ultrasound.
DBT has rapidly become a prominent method of breast cancer screening in the United States, especially in higher-income regions with larger White populations. Use of DBT for breast cancer screening increased from 13% in 2015 to over 40% in 2017.[24] Seventy-three percent of facilities now report use of DBT.[23]
Observational data from eight screening facilities in Vermont compared the findings from 86,379 DBT and 97,378 full-field DM screening examinations performed between 2012 and 2016. Women were included if they had no history of breast cancer or breast implants. Demographic and risk factor information was obtained by questionnaire, and pathology for all biopsies was obtained through the Vermont Breast Cancer Surveillance System. Recall rate was lower with DBT than with DM (7.9% vs. 10.9%; odds ratio [OR], 0.81; 95% confidence interval [CI], 0.77–0.85), but there was no difference in the rates of biopsy or the detection of benign or malignant disease.[25]
The Oslo Tomosynthesis Screening Trial was conducted between November 2010 and December 2012 and included 24,301 women with 281 cancers. The trial compared the sensitivity of DM with DM plus DBT and with DM plus computer-aided detection and of DM plus DBT with synthesized 2-dimensional mammography plus DBT. Researchers report that DBT plus DM detected more breast cancers than DM alone (230 vs. 177, a 22.7% relative increase [95% CI, 17%–28.6%]). The trial also reported somewhat fewer false-positive findings on DBT plus DM compared with DM alone (2,081 vs. 2,466, a 0.8% relative reduction [95% CI, -1.03 to -0.57]), except in women with extremely dense breasts.[26] Differences between CAD plus DM and DM alone were not statistically significant.
The Tomosynthesis Trial in Bergen (To-Be) compared DBT plus synthesized mammography (SM) with conventional DM in population-based screening, including all women aged 50 to 69 years who were invited for breast cancer screening in Bergen, Norway. Screening was performed with two-view DBT plus SM or two-view conventional DM. A pool of eight radiologists independently double read the screening mammograms. Interim results from the first year of the trial showed:[27]
The primary outcome results were published later.[28] The authors suggested explanations for the difference between these results and those from previous studies. First, SM may produce inferior quality images when compared with conventional DM, including poor visualization of microcalcifications. Second, the eight radiologists had wide variations in experience (ranging from 0 to 19 years) reading screen-film and/or DM and DBT in population-based breast cancer screening.
Another study used three different Cancer Intervention and Surveillance Modeling Network (CISNET) breast cancer models and incorporated DBT screening performance data into the models to determine the cost and benefits of DBT versus DM. The study concluded that the use of DBT screening instead of DM reduced false-positives and recall rates and was projected to reduce breast cancer deaths (0–0.21 deaths per 1,000 women) and increased quality-adjusted life-years (QALYs) (1.97–3.27 per 1,000 women). However, these improvements were generally small and were associated with high costs relative to benefits: cost-effectiveness ratios ranged from $195,026 to $270,135 per QALY gained. These are greater than commonly accepted thresholds of $50,000 to $150,000 per QALY.[29]
An important limitation of the available studies and statistical modeling is lack of evidence of the clinical significance of the additional breast cancers detected by DBT (with or without DM) versus DM alone. The extent to which DBT may contribute to overdiagnosis of non–life-threatening lesions or lesions that would have still been detected in an asymptomatic woman at the time of a future DM is unknown. To date, there are no studies of DBT that show a reduction in metastatic disease or other late-stage disease.
Five ongoing randomized controlled trials with a combined recruitment of 430,000 women in Europe, the United Kingdom, and the United States are expected to provide information about clinical breast cancer outcomes of mammographic screening using DBT compared with DM.[26,30]
The randomized TOSYMA trial assessed DBT plus synthesized mammography versus digital screening mammography alone for the detection of breast cancer. The primary end points were detection of invasive breast cancer and the interval invasive cancer detection rate at 24 months. However, neither of these end points has been validated as proper surrogate outcome measures for mortality. The detection of greater numbers of early-stage cancers may confer no mortality benefit, as many of these cancers may fail to progress or progress so slowly that they pose no threat to the patient’s life (i.e., result in overdiagnosis). Moreover, if the detection of nonlethal cancers substantially increases, then the interval cancer detection rates may decrease with no subsequent reduction in mortality.[8]
A cohort study comparing DBT with DM found that the two modalities were not associated with a significant difference in risk of interval invasive cancer. However, DBT was associated with a significantly lower risk of advanced breast cancer among women with extremely dense breasts at high risk of developing breast cancer.[31] Better clarification on this issue may come from the ongoing Tomosynthesis Mammographic Imaging Screening Trial (TMIST), in which women are randomly assigned to either standard digital breast imaging or DBT, and the primary outcome is rate of advanced cancers, a composite end point that includes distant metastases.
Regardless of stage, nodal status, and tumor size, screen-detected cancers have a better prognosis than those diagnosed outside of screening.[2] This suggests that they are biologically less lethal (perhaps slower growing and less likely to invade locally and metastasize). This is consistent with the length bias effect associated with screening. That is, screening is more likely to detect indolent (i.e., slow-growing) breast cancers, while the more aggressive cancers are detected in the intervals between screening sessions.
A 10-year follow-up study of 1,983 Finnish women with invasive breast cancer demonstrated that the method of cancer detection is an independent prognostic variable. When controlled for age, nodal status, and tumor size, screen-detected cancers had a lower risk of relapse and better overall survival. For women whose cancers were detected outside of screening, the hazard ratio (HR) for death was 1.90 (95% CI, 1.15–3.11), even though they were more likely to receive adjuvant systemic therapy.[32]
Similarly, an examination of the breast cancers found in three randomized screening trials (Health Insurance Plan, National Breast Screening Study [NBSS]-1, and NBSS-2) accounted for stage, nodal status, and tumor size and determined that patients whose cancer was found via screening had a more favorable prognosis. The relative risks (RRs) for death were 1.53 (95% CI, 1.17–2.00) for interval and incident cancers, compared with screen-detected cancers; and 1.36 (95% CI, 1.10–1.68) for cancers in the control group, compared with screen-detected cancers.[33]
A third study compared the outcomes of 5,604 English women with screen-detected cancers to those with symptomatic breast cancers diagnosed between 1998 and 2003. After controlling for tumor size, nodal status, grade, and patient age, researchers found that the women with screen-detected cancers fared better. The HR for survival of the symptomatic women was 0.79 (95% CI, 0.63–0.99).[32,34]
The findings of these studies are also consistent with the evidence that some screen-detected cancers are low risk and represent overdiagnosis.
Numerous uncontrolled trials and retrospective series have documented the ability of mammography to diagnose small, early-stage breast cancers, which have a favorable clinical course.[35] Individuals whose cancer is detected by screening show a higher survival rate than those whose cancers are not detected by screening even when screening has not prolonged any lives. This concept is explained by the following four types of statistical bias:
The impact of these biases is not known. A new randomized controlled trial (RCT) with cause-specific mortality as the end point is needed to determine both survival benefit and impact of overdiagnosis, lead time, length time, and healthy volunteer biases. This is not achievable; randomly assigning patients to screen and nonscreen groups would be unethical, and at least three decades of follow-up would be needed, during which time changes in treatment and imaging technology would invalidate the results. Decisions must therefore be based on available RCTs, despite their limitations, and on ecological or cohort studies with adequate control groups and adjustment for confounding. For more information, see Cancer Screening Overview.
Performance benchmarks for screening mammography in the United States are described on the Breast Cancer Surveillance Consortium (BCSC) website. For more information, see Cancer Screening Overview.
The sensitivity of mammography is the percentage of women with breast cancers detected by mammographic screening. Sensitivity depends on tumor size, conspicuity, hormone sensitivity, breast tissue density, patient age, timing within the menstrual cycle, overall image quality, and interpretive skill of the radiologist. Overall sensitivity is approximately 79% but is lower in younger women and in those with dense breast tissue (see the BCSC website).[37-39] Sensitivity is not the same as benefit because some women with possible breast cancer are harmed by overdiagnosis. According to the Physician's Insurance Association of America (PIAA), delay in diagnosis of breast cancer and errors in diagnosis are common causes of medical malpractice litigation. PIAA data from 2002 through 2011 note that the largest total indemnity payments for breast cancer claims are for errors in diagnosis.[40]
The specificity of mammography is the percentage of all women without breast cancer whose mammograms are negative. The false-positive rate is the likelihood of a positive test in women without breast cancer. Low specificity and high rate of false-positives result in unnecessary follow-up examinations and procedures. Because specificity includes all women without cancer in the denominator, even a small percentage of false-positives turns out to be a large number in absolute terms. Thus—in screening—a good specificity must be very high. Even 95% specificity is quite low for a screening test.
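To make the point about absolute numbers concrete, the Python sketch below counts expected true-positive and false-positive results in a screened cohort. The sensitivity (79%) and specificity (95%) are the figures mentioned in the text; the cohort size of 10,000 and the cancer prevalence of 5 per 1,000 are assumed values used only for illustration, and the function name is hypothetical.

```python
def screening_counts(n_women: int, prevalence: float,
                     sensitivity: float, specificity: float) -> dict[str, float]:
    """Return expected screening outcomes for a cohort (expected values, not integers)."""
    with_cancer = n_women * prevalence
    without_cancer = n_women - with_cancer
    true_pos = with_cancer * sensitivity
    # The false-positive rate is 1 - specificity, applied to women without cancer.
    false_pos = without_cancer * (1 - specificity)
    # Positive predictive value: share of positive screens that are true cancers.
    ppv = true_pos / (true_pos + false_pos)
    return {"true_pos": true_pos, "false_pos": false_pos, "ppv": ppv}


# ASSUMED cohort size and prevalence; sensitivity and specificity from the text.
result = screening_counts(n_women=10_000, prevalence=0.005,
                          sensitivity=0.79, specificity=0.95)
print(result)
```

Under these assumptions, roughly 500 women would receive a false-positive result for about 40 cancers detected, so fewer than 1 in 10 positive screens would reflect cancer.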
Interval cancers are cancers that are diagnosed in the interval between a normal screening examination and the anticipated date of the next screening mammogram. One study found interval cancers occurred more often in women younger than 50 years, and had mucinous or lobular histology, high histological grade, high proliferative activity with relatively benign mammographic features, and no calcifications. Conversely, screen-detected cancers often had tubular histology, small size, low stage, hormone sensitivity, and a major component of DCIS.[41] Overall, interval cancers have characteristics of rapid growth,[41,42] are diagnosed at an advanced stage, and carry a poor prognosis.[43]
Because of length bias, mammography screening preferentially detects indolent cancers that grow more slowly (i.e., exist for a longer length of time in the preclinical phase). In contrast, the more aggressive cancers grow faster (i.e., spend a shorter length of time in the preclinical phase) and are often detected clinically in the intervals between screening sessions. For a more detailed explanation of length and lead-time bias in cancer screening, see Cancer Screening Overview.
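Length bias can be demonstrated with a toy simulation, sketched below in Python. Every modeling choice is an assumption made for illustration and is not drawn from the cited studies: tumors become detectable at a uniformly random time within one screening interval, the preclinical (sojourn) time is exponentially distributed, the screen has perfect sensitivity, and the 2-year interval and 2-year mean sojourn time are arbitrary.

```python
import random
from statistics import fmean


def simulate_length_bias(n_tumors: int = 100_000, screen_interval: float = 2.0,
                         mean_sojourn: float = 2.0, seed: int = 0) -> tuple[float, float]:
    """Return mean preclinical duration (years) of screen-detected and interval cancers."""
    rng = random.Random(seed)
    screen_detected: list[float] = []
    interval_cancers: list[float] = []
    for _ in range(n_tumors):
        onset = rng.uniform(0.0, screen_interval)      # time the tumor becomes detectable
        sojourn = rng.expovariate(1.0 / mean_sojourn)  # time from detectability to symptoms
        if onset + sojourn >= screen_interval:
            screen_detected.append(sojourn)    # still preclinical when the screen occurs
        else:
            interval_cancers.append(sojourn)   # surfaced clinically before the screen
    return fmean(screen_detected), fmean(interval_cancers)


screen_mean, interval_mean = simulate_length_bias()
print(f"screen-detected: {screen_mean:.2f} years, interval: {interval_mean:.2f} years")
```

In this model the chance that a tumor is still preclinical at the screen rises with its sojourn time, so the screen-detected group is enriched for slow-growing tumors even though screening has changed no outcomes.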
In recent years, novel breast cancer screening technologies have been assessed in clinical trials with the interval cancer detection rate as the primary outcome of interest, and newer screening methods recommended on the basis of reductions in interval cancer detection rates. However, the interval cancer detection rate has not been validated as a proper surrogate for breast cancer mortality, and its use as a surrogate outcome measure in breast cancer screening trials remains controversial.
In breast cancer screening programs, screen-detected breast cancers tend to have a better prognosis than cancers detected during the intervals between screening sessions (interval breast cancers). This was confirmed in a registry-based cohort study from Manitoba in which interval breast cancers were more likely than were screen-detected breast cancers to be high-grade and estrogen receptor–negative, and associated with greater than a threefold increased risk of breast cancer death.[44]
The Nova Scotia Breast Screening Program defined missed cancers as those that were false-negatives on the previous screening exam, occurring less often than 1 per 1,000 women. It concluded that interval cancers occurred in approximately 1 per 1,000 women aged 40 to 49 years, and 3 per 1,000 women aged 50 to 59 years.[45]
Conversely, a larger trial found that interval cancers were more prevalent in women aged 40 to 49 years. Those appearing within 12 months of a negative screening mammogram were usually attributable to greater breast density. Those appearing within a 24-month interval were related to decreased mammographic sensitivity caused by greater breast density or to rapid tumor growth.[46]
The accuracy of mammography has been noted to vary with patient characteristics, such as a woman's age, breast density, whether it is her first or subsequent exam, and the time since her last mammogram. Mammography in younger women has lower sensitivity and higher false-positive rates than in older women.
The Million Women Study in the United Kingdom found decreased sensitivity and specificity in women aged 50 to 64 years if they used postmenopausal hormone therapy, had prior breast surgery, or had a body mass index below 25.[47] Increased time since the last mammogram increases sensitivity, recall rate, and cancer detection rate and decreases specificity.[48]
The United Kingdom Age Trial assessed the efficacy of mammography screening for women younger than 50 years. After a median follow-up of 22.8 years, there was no difference in breast cancer mortality between women randomly assigned to initiate screening at age 39 to 41 years until entry into the National Health Service (NHS) breast screening program at age 50 to 52 years, versus the group that did not initiate mammography screening until entry into the NHS breast screening program (RR, 0.98; 95% CI, 0.79–1.22; P = .86).[49]
Sensitivity may be improved by scheduling the exam after the initiation of menses or during an interruption from hormone therapy.[50] Obese women have more than a 20% increased risk of having false-positive mammography, although sensitivity is unchanged.[51]
Dense breasts may obscure the detection of small masses on mammography, thereby reducing the sensitivity of mammography.[13] For women of all ages, high breast density is associated with 10% to 29% lower sensitivity.[38] High breast density is also associated with a modestly increased risk of developing breast cancer.[52] High breast density does not confer a higher risk of breast cancer death.
High breast density is an inherent trait, which can be inherited [53,54] or affected by age; endogenous [55] and exogenous [56,57] hormones;[58] selective estrogen receptor modulators, such as tamoxifen;[59] and diet.[60] Hormone therapy is associated with increased breast density, lower mammographic sensitivity, and an increased rate of interval cancers.[61]
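Dense breast tissue is not abnormal. Breast density describes the proportion of dense versus fatty tissue in a mammographic image.[62] The American College of Radiology’s BI-RADS classifies breast density into four categories: (a) almost entirely fatty, (b) scattered areas of fibroglandular density, (c) heterogeneously dense, and (d) extremely dense.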
The latter two categories are considered dense breast tissue, a description affecting 43% of women aged 40 to 74 years.[63] A radiologist's assignment of breast density is subjective and may vary over time in any woman.[63,64]
There is limited high-quality evidence to guide optimal breast cancer screening in individuals with dense breasts. For dense breasts, digital breast tomosynthesis has improved sensitivity and modestly lowers false-positive rates compared with conventional digital mammography.[65]
Supplemental imaging with ultrasonography or breast magnetic resonance imaging (MRI) has been suggested by some groups for screening women with dense breasts, but there are no data showing that this strategy results in lower breast cancer mortality. The potential harm of adding these supplemental screening tests is the likelihood of producing more false-positives, leading to additional imaging and breast biopsies, with resultant anxiety and cost.[66] Supplemental screening may also increase overdiagnosis of breast cancer with resultant overtreatment.
A study examining cancer detection end points in women with dense breasts undergoing supplemental screening (e.g., ultrasound, MRI, digital resources) showed higher breast cancer detection, but it is not known if that translates into cancer protection.[67] An RCT of supplemental MRI versus mammography only in 40,373 individuals aged 50 to 75 years with extremely dense breasts in the Netherlands was performed.[68] The study showed lower incidence of interval cancers at 2 years of follow-up in the MRI group (2.5 per 1,000 screenings in the group invited to receive MRI, 0.8 per 1,000 in the group that actually received MRI, and 5.0 per 1,000 in the group that received mammography only). This finding suggests that at least some of the excess cancers detected by MRI in the MRI group were earlier diagnoses of cancers that would have become clinically apparent. However, whether earlier diagnoses facilitated by MRI resulted in improved clinical outcomes has not been shown. As would be expected, cancers detected by MRI were more likely to have favorable tumor characteristics than interval cancers. MRI screening was associated with 79.8 false-positive results per 1,000 screenings.[68]
A prospective multicenter study, known as the Dense Breast Tomosynthesis Ultrasound Screening Trial (DBTUST), investigated whether ultrasound improved cancer detection after digital breast tomosynthesis (DBT) in women with dense breasts.[69] Between December 2015 and June 2021, 6,179 women at three Pennsylvania locations underwent three rounds of annual screening with DBT and technologist-performed handheld ultrasound. The images were interpreted by two radiologists at baseline, 12 months, and 24 months. The study concluded that technologist-performed ultrasound screening modestly improved detection of cancer in women with dense breasts by 1.3 cases per 1,000 in year 1 and by 1 case per 1,000 in years 2 to 3. This screening also increased the false-positive recall rate. In 3 years, 1,007 (16.3%) women had a false-positive recall based on DBT, and an additional 761 (12.3%) women had a false-positive recall based on ultrasound.
The FDA mandates that mammography facilities report breast density to patients and suggest that patients speak with their primary care clinician about supplemental screening.[70] However, limited evidence, inconsistent guidelines, and wording of breast density reports have generated confusion and anxiety among patients and health care providers.[71]
Mucinous and lobular cancers are more easily detected by mammography. Rapidly growing cancers can sometimes be mistaken for normal breast tissue (e.g., medullary carcinomas, an uncommon type of invasive ductal breast cancer that is often associated with the BRCA1 mutation and aggressive characteristics, but that may demonstrate comparatively favorable responses to treatment).[41,72] Some other cancers associated with BRCA1/2 mutations, which may appear indolent, can also be missed.[73,74]
Radiologists’ performance is variable, affected by levels of experience and the volume of mammograms they interpret.[75] Biopsy recommendations of radiologists in academic settings have a higher positive predictive value (PPV) than do those of community radiologists.[76] Fellowship training in breast imaging may improve detection.[11]
Performance also varies by facility. Mammographic screening accuracy was higher at facilities offering only screening examinations than at those also performing diagnostic tests. Accuracy was also better at facilities with a breast imaging specialist on staff, performing single rather than double readings, and reviewing performance audits two or more times each year.[77]
False-positive rates are higher at facilities where concern about malpractice is high and at facilities serving vulnerable women (racial or ethnic minority women and women with less education, limited household income, or rural residence).[78] These populations may have a higher cancer prevalence and a lack of follow-up.[79]
Artificial intelligence (AI) algorithms are being developed to interpret screening mammograms and breast biopsy specimens.[80-82] While such tools may improve interpretive speed and reproducibility in the future, it is unknown if they will exacerbate overdiagnosis [83] and how they might influence physicians’ final assessments.
International comparisons of screening mammography have found higher specificity in countries with more highly centralized screening systems and national quality assurance programs.[84,85]
The recall rate in the United States is twice that of the United Kingdom, with no difference in the rate of cancer detection.[84]
The likelihood of diagnosing cancer is highest with the prevalent (first) screening examination, ranging from 9 to 26 cancers per 1,000 screens, depending on the woman’s age. The likelihood decreases for follow-up examinations, ranging from 1 to 3 cancers per 1,000 screens.[86]
The optimal interval between screening mammograms is unknown; there is little variability in results across the trials despite differences in protocols and screening intervals. A prospective U.K. trial randomly assigned women aged 50 to 62 years to receive mammograms annually or triennially. Although tumor grade and nodal status were similar in the two groups, more cancers of slightly smaller size were detected in the annual screening group than in the triennial screening group.[87]
A large observational study found a slightly increased risk of late-stage disease at diagnosis for women in their 40s who were adhering to a 2-year versus a 1-year schedule (28% vs. 21%; OR, 1.35; 95% CI, 1.01–1.81), but no difference was seen for women in their 50s or 60s based on schedule difference.[88,89]
A Finnish study randomly assigned 14,765 women aged 40 to 49 years to receive either annual or triennial screens. There were 18 deaths from breast cancer in 100,738 life-years in the triennial screening group and 18 deaths from breast cancer in 88,780 life-years in the annual screening group (HR, 0.88; 95% CI, 0.59–1.27).[90]
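The death counts and life-years reported for the Finnish study can be turned into crude mortality rates. The following is a minimal sketch of that arithmetic; it is a crude rate ratio, not the study's own hazard model, and the function name is illustrative.

```python
def rate_per_100k(deaths: int, life_years: int) -> float:
    """Crude breast cancer death rate per 100,000 life-years."""
    return deaths / life_years * 100_000


# Figures as reported for the Finnish study.
triennial_rate = rate_per_100k(18, 100_738)
annual_rate = rate_per_100k(18, 88_780)

# Crude ratio of the triennial rate to the annual rate (no adjustment of any kind).
crude_ratio = triennial_rate / annual_rate

print(f"triennial: {triennial_rate:.1f} per 100,000 life-years")
print(f"annual:    {annual_rate:.1f} per 100,000 life-years")
print(f"crude ratio (triennial/annual): {crude_ratio:.2f}")
```

The crude ratio rounds to 0.88, in line with the reported HR, and the wide confidence interval reflects the small number of deaths in each group.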
RCTs that studied the effect of screening mammography on breast cancer mortality were performed between 1963 and 2015, with participation by over half a million women in four countries. One trial, the Canadian NBSS-2, compared mammography plus clinical breast examination (CBE) to CBE alone; the other trials compared screening mammography with or without CBE to usual care. For a detailed description of the trials, see the Appendix of Randomized Controlled Trials section.
The trials differed in design, recruitment of participants, interventions (both screening and treatment), management of the control group, compliance with assignment to screening and control groups, and analysis of outcomes. Some trials used individual randomization, while others used cluster randomization in which cohorts were identified and then offered screening; one trial used nonrandomized allocation by day of birth in any given month. Cluster randomization sometimes led to imbalances between the intervention and control groups. Age differences have been identified in several trials, although the differences had no major effect on the trial outcome.[91] In the Edinburgh Trial, socioeconomic status, which correlates with the risk of breast cancer mortality, differed markedly between the intervention and control groups, rendering the results uninterpretable.
Breast cancer mortality was the major outcome parameter for each of these trials, so the attribution of cause of death required scrupulous attention. The use of a blinded monitoring committee (New York) and a linkage to independent data sources, such as national mortality registries (Swedish trials), were incorporated but could not ensure impartial attributions of cancer death for women in the screening or control arms. Possible misclassification of breast cancer deaths in the Two-County Trial biasing the results in favor of screening has been suggested.[92]
There were also differences in the methodology used to analyze the results of these trials. Four of the five Swedish trials were designed to include a single screening mammogram in the control group and were timed to correspond with the end of the series of screening mammograms in the study group. The initial analysis of these trials used an evaluation analysis, tallying only the breast cancer deaths that occurred in women whose cancer was discovered at or before the last study mammogram. In some of the trials, a delay occurred in the performance of the end-of-study mammogram, resulting in more time for members of the control group to develop or be diagnosed with breast cancer. Other trials used a follow-up analysis, which counts all deaths attributed to breast cancer, regardless of the time of diagnosis. This type of analysis was used in a meta-analysis of four of the five Swedish trials as a response to concerns about the evaluation analyses.[92]
The accessibility of the data for international audits and verification also varied, with a formal audit having been undertaken only in the Canadian trials. Other trials have been audited to varying degrees, but with less rigor.[93]
All of these studies were designed to study breast cancer mortality rather than all-cause mortality because breast cancer deaths contribute only a small proportion of total mortality in any given population. When all-cause mortality in these trials was examined retrospectively, only the Edinburgh Trial showed a difference, which was attributable to the previously noted socioeconomic differences in the study groups. The meta-analysis (follow-up methods) of the four Swedish trials also showed a small improvement in all-cause mortality.
The relative improvement in breast cancer mortality attributable to screening is approximately 15% to 20%, and the absolute improvement at the individual level is much less. The potential benefit of breast cancer screening can be expressed as the number of lives extended because of early breast cancer detection.[94,95]
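The gap between relative and absolute benefit can be made concrete using the numbers needed to invite quoted in the Overview of this summary. The following is a minimal sketch that relies only on the definitional relationship that the absolute risk reduction is the reciprocal of the number needed to invite; the variable and function names are illustrative.

```python
# Numbers needed to invite to prevent one breast cancer death,
# as quoted in the Overview (point estimates only).
NUMBER_NEEDED_TO_INVITE = {"39-49": 1904, "50-59": 1339, "60-69": 377}


def absolute_risk_reduction(number_needed_to_invite: int) -> float:
    """Absolute risk reduction implied by a number needed to invite."""
    return 1.0 / number_needed_to_invite


for age_group, nni in NUMBER_NEEDED_TO_INVITE.items():
    arr = absolute_risk_reduction(nni)
    print(f"ages {age_group}: {arr:.3%} absolute reduction, "
          f"about {arr * 10_000:.1f} deaths prevented per 10,000 women invited")
```

On these point estimates, inviting 10,000 women to screening corresponds to roughly 5, 7, and 27 breast cancer deaths prevented in the three age groups, which is far smaller than the relative reduction suggests.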
The RCT results represent experiences in a defined period of regular examinations, but in practice, women undergo 20 to 30 years of screening throughout their lifetimes.[89,96]
There are several problems with using these RCTs that were performed up to 50 years ago to estimate the current benefits of screening on breast cancer mortality. These problems include the following:
For these reasons, estimates of the breast cancer mortality reduction resulting from current screening are based on well-conducted cohort and ecological studies in addition to the RCTs.
An estimate of screening effectiveness can be obtained from nonrandomized controlled studies of screened versus nonscreened populations, case-control studies of screening in real communities, and modeling studies that examine the impact of screening on large populations. These studies must be designed to minimize or exclude the effects of unrelated trends influencing breast cancer mortality such as improved treatment and heightened awareness of breast cancer in the community.
Three population-based, observational studies from Sweden compared breast cancer mortality in the presence and absence of screening mammography programs. One study compared two adjacent time periods in 7 of the 25 counties in Sweden and found a statistically significant breast cancer mortality reduction of 18% to 32% attributable to screening.[97] The most important bias in this study is that the advent of screening in these counties occurred over a period during which dramatic improvements in the effectiveness of adjuvant breast cancer therapy were being made, changes that were not addressed by the study authors. The second study considered an 11-year period comparing seven counties with screening programs with five counties without them.[98] There was a trend in favor of screening, but again, the authors did not consider the effect of adjuvant therapy or differences in geography (urban vs. rural) that might affect treatment practices.
The third study attempted to account for the effects of treatment by using a detailed analysis by county. It found screening had little impact, a conclusion weakened by several flaws in design and analysis.[99]
In Nijmegen, the Netherlands, where a population-based screening program was undertaken in 1975, a case-cohort study found that screened women had decreased mortality compared with unscreened women (OR, 0.48).[100] However, a subsequent study comparing Nijmegen breast cancer mortality rates with neighboring Arnhem in the Netherlands, which had no screening program, showed no difference in breast cancer mortality.[101]
A community-based case-control study of screening in high-quality U.S. health care systems between 1983 and 1998 found no association between previous screening and reduced breast cancer mortality, but the mammography screening rates were generally low.[102]
A well-conducted ecological study compared three pairs of neighboring European countries that were matched on similarity in health care systems and population structure, one of which had started a national screening program some years earlier than the others. The investigators found that each country had experienced a reduction in breast cancer mortality, with no difference between matched pairs that could be attributed to screening. The authors suggested that improvements in breast cancer treatment and/or health care organizations were more likely responsible for the reduction in mortality than was screening.[103]
A systematic review of ecological and large cohort studies published through March 2011 compared breast cancer mortality in large populations of women, aged 50 to 69 years, who started breast cancer screening at different times. Seventeen studies met inclusion criteria, but all studies had methodological problems, including control group dissimilarities, insufficient adjustment for differences between areas in breast cancer risk and breast cancer treatment, and problems with similarity of measurement of breast cancer mortality between compared areas. There was great variation in results among the studies, with four studies finding a relative reduction in breast cancer mortality of 33% or more (with wide CIs) and five studies finding no reduction in breast cancer mortality. Because only a part of the overall reduction in breast cancer mortality could possibly be attributed to screening, the review concluded that any relative reduction in breast cancer mortality resulting from screening would likely be no more than 10%.[104]
A U.S. ecological analysis conducted between 1976 and 2008 examined the incidence of early-stage versus late-stage breast cancer for women aged 40 years and older. To assess a screening effect, the authors compared the magnitude of increase in early-stage cancer with the magnitude of an expected decrease in late-stage cancer. Over the study, the absolute increase in the incidence of early-stage cancer was 122 cancers per 100,000 women, while the absolute decrease in late-stage cancers was 8 cases per 100,000 women. After adjusting for changes in incidence resulting from hormone therapy and other undefined causes, the authors concluded (1) the benefit of screening on breast cancer mortality was small, (2) between 22% and 31% of diagnosed breast cancers represented overdiagnosis, and (3) the observed improvement in breast cancer mortality was probably attributable to improved treatment rather than screening.[105]
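The comparison at the center of this analysis can be sketched in a few lines. This is a rough illustration of the unadjusted figures only; it does not reproduce the authors' adjustments for hormone therapy and other causes, so it does not yield their 22% to 31% overdiagnosis estimate.

```python
# Unadjusted changes in incidence over the study period, per 100,000 women.
early_stage_increase = 122
late_stage_decrease = 8

# If every additional early-stage cancer had been destined to present at a
# late stage, the two figures would be about equal.
offset_fraction = late_stage_decrease / early_stage_increase
unmatched_excess = early_stage_increase - late_stage_decrease

print(f"late-stage decrease offsets {offset_fraction:.1%} of the early-stage increase")
print(f"unmatched excess: {unmatched_excess} early-stage cancers per 100,000 women")
```

Only about 7% of the increase in early-stage cancers was matched by a decrease in late-stage cancers, leaving an excess of 114 early-stage cancers per 100,000 women before any adjustment.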
An analytic approach was used to approximate the contributions of screening versus treatment to breast cancer mortality reduction and the magnitude of overdiagnosis.[106] The shift in the size distribution of breast cancers in the United States from the period before the introduction of mammography to 2012 (after its widespread dissemination) was investigated using SEER data in women aged 40 years and older. The rate of clinically meaningful breast cancer was assumed to be stable during this time. The authors documented a lower incidence of larger (≥2 cm) tumors as well as a reduction in breast cancer case fatality. The lower mortality for women with larger tumors was attributed to improvements in therapy. Two-thirds of the decline in size-specific case fatality was ascribed to improved treatment.
A prospective cohort study of community-based screening programs in the United States found that annual compared with biennial screening mammography did not reduce the proportion of unfavorable breast cancers detected in women aged 50 to 74 years or in women aged 40 to 49 years without extremely dense breasts. Women aged 40 to 49 years with extremely dense breasts did have a reduction in cancers larger than 2.0 cm with annual screening (OR for biennial vs. annual screening, 2.39; 95% CI, 1.37–4.18).[107]
An observational study of women aged 40 to 74 years conducted in 7 of 12 Canadian screening programs compared breast cancer mortality in those participants screened at least once between 1990 and 2009 (85% of the population) with those not screened (15% of the population). The abstract reported a 40% average breast cancer mortality among participants; however, it was likely intended to report a 40% reduction in breast cancer mortality on the basis of language used in the Discussion section.[108]
Limitations of this study included the lack of all-cause mortality data, the extent of screening, screening outside of the study, screening prior to the study, the method used for calculating expected mortality and the referent rates of nonparticipants, nonparticipant survival, province-specific population differences, the extent to which limitations of the database prevented correcting for age and other differences between participants, the generalizability of the substudy data of a single province (British Columbia), and the potentially large impact of selection bias. Overall, the study lacked important data and had limitations in methodology and data analysis.
The optimal screening interval has been addressed by modelers. Modeling makes assumptions that may not be correct; however, the credibility of modeling is greater when the model produces overall results that are consistent with randomized trials and when the model is used to interpolate or extrapolate. For example, if a model’s output agrees with RCT outcomes for annual screening, it has greater credibility to compare the relative effectiveness of biennial versus annual screening.
In 2000, the National Cancer Institute formed a consortium of modeling groups (Cancer Intervention and Surveillance Modeling Network [CISNET]) to address the relative contribution of screening and adjuvant therapy to the observed decline in breast cancer mortality in the United States.[109] These models predicted reductions in breast cancer mortality similar to those expected in the circumstances of the RCTs but updated to the use of modern adjuvant therapy. In 2009, CISNET modelers addressed several questions related to the harms and benefits of mammography, including comparing annual versus biennial screening.[89] Women aged 50 to 74 years received most of the mortality benefit of annual screening by having a mammogram every 2 years. The proportion of the reduction in breast cancer deaths that was maintained after moving from annual to biennial screening ranged across the six models from 72% to 95%, with a median of 80%.
Data are limited as to how much of the reduction in mortality, seen over time from 1990 onward, is attributable to advances in imaging techniques for screening and as to how much is the result of the improved effectiveness of therapy. In one CISNET study of six simulation models, about one-third of the decrease in breast cancer mortality in 2012 was attributable to screening, with the balance attributed to treatment.[110] In this CISNET study, the mean estimated reduction in overall breast cancer mortality rate was 49% (model range, 39%–58%), relative to the estimated baseline rate in 2012 if there was no screening or treatment; 37% (model range, 26%–51%) of this reduction was associated with screening, and 63% (model range, 49%–74%) of this reduction was associated with treatment.
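The CISNET shares can be converted into percentage points of the overall mortality reduction. The following is a minimal sketch using the mean estimates only; it ignores the model ranges and simply multiplies the reported shares by the reported total.

```python
# Mean CISNET estimates for 2012, relative to no screening and no treatment.
total_reduction_pct = 49.0   # overall reduction in breast cancer mortality rate
screening_share = 0.37       # share of the reduction associated with screening
treatment_share = 0.63       # share of the reduction associated with treatment

screening_points = total_reduction_pct * screening_share
treatment_points = total_reduction_pct * treatment_share

print(f"screening: about {screening_points:.1f} percentage points")
print(f"treatment: about {treatment_points:.1f} percentage points")
```

On these mean values, screening accounts for about 18 percentage points and treatment for about 31 percentage points of the estimated 49% reduction.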
The negative effects of screening mammography are overdiagnosis (true positives that will not become clinically significant), false-positives (related to the specificity of the test), false-negatives (related to the sensitivity of the test), discomfort associated with the test, radiation risk, psychological harm, financial stress, and opportunity costs.
Table 1 provides an overview of the estimated benefits and harms of screening mammography for 10,000 women who underwent annual screening mammography over a 10-year period.[111]
Age, y | No. of Breast Cancer Deaths Averted With Mammography Screening During the Next 15 y b | No. (95% CI) With ≥1 False-Positive Result During the 10 y c | No. (95% CI) With ≥1 False-Positive Resulting in a Biopsy During the 10 y c | No. of Breast Cancers or DCIS Diagnosed During the 10 y That Would Never Become Clinically Important (Overdiagnosis) d
---|---|---|---|---
40 | 1–16 | 6,130 (5,940–6,310) | 700 (610–780) | ?–104 e
50 | 3–32 | 6,130 (5,800–6,470) | 940 (740–1,150) | 30–137
60 | 5–49 | 4,970 (4,780–5,150) | 980 (840–1,130) | 64–194

No. = number; CI = confidence interval; DCIS = ductal carcinoma in situ.
a Adapted from Pace and Keating.[111]
b Number of deaths averted are from Welch and Passow.[112] The lower bound represents breast cancer mortality reduction if the breast cancer mortality relative risk were 0.95 (based on minimal benefit from the Canadian trials [113,114]), and the upper bound represents the breast cancer mortality reduction if the relative risk were 0.64 (based on the Swedish 2-County Trial [115]).
c False-positive and biopsy estimates and 95% confidence intervals are 10-year cumulative risks reported in Hubbard et al. [116] and Braithwaite et al.[117]
d The number of overdiagnosed cases are calculated by Welch and Passow.[112] The lower bound represents overdiagnosis based on results from the Malmö trial,[118] whereas the upper bound represents the estimate from Bleyer and Welch.[105]
e The lower-bound estimate for overdiagnosis reported by Welch and Passow [112] came from the Malmö study.[118] The study did not enroll women younger than 50 years.
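Because Table 1 is expressed per 10,000 women, its counts convert directly to cumulative 10-year percentages. The following is a minimal sketch using the point estimates only (confidence intervals omitted); the class and function names are illustrative.

```python
from dataclasses import dataclass

COHORT_SIZE = 10_000  # Table 1 is expressed per 10,000 women screened annually


@dataclass(frozen=True)
class Table1Row:
    age: int                    # age at the start of screening, years
    false_positive: int         # women with >=1 false-positive result in 10 y
    false_positive_biopsy: int  # women with >=1 false-positive biopsy in 10 y


ROWS = [
    Table1Row(40, 6130, 700),
    Table1Row(50, 6130, 940),
    Table1Row(60, 4970, 980),
]


def share(count: int) -> float:
    """Fraction of the 10,000-woman cohort represented by a count."""
    return count / COHORT_SIZE


for row in ROWS:
    print(f"age {row.age}: {share(row.false_positive):.1%} with a false-positive result, "
          f"{share(row.false_positive_biopsy):.1%} with a false-positive biopsy")
```

Read this way, about 61% of women who begin annual screening at age 40 or 50 years, and about 50% of those who begin at age 60 years, have at least one false-positive result over 10 years, and 7% to 10% have a biopsy prompted by a false-positive result.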
Overdiagnosis occurs when screening procedures detect cancers that would never become clinically apparent in the absence of screening. It is a special concern because identification of the cancer does not benefit the individual, while the side effects of diagnostic procedures and cancer treatment may cause significant harm. The magnitude of overdiagnosis is debated, particularly regarding DCIS, a cancer precursor whose natural history is unknown. Because tumor behavior cannot be predicted confidently at the time of diagnosis, standard treatment for invasive cancers and DCIS can cause overtreatment. The related harms, which are immediate, include treatment-related side effects and the various harms associated with a cancer diagnosis. Conversely, a mortality benefit would occur at an uncertain point in the future.
One approach to understanding overdiagnosis is to examine the prevalence of occult cancer in women who died of noncancer causes. In an overview of seven autopsy studies, the median prevalence of occult invasive breast cancer was 1.3% (range, 0%–1.8%) and of DCIS was 8.9% (range, 0%–14.7%).[119,120]
Overdiagnosis can be indirectly measured by comparing breast cancer incidence in screened versus unscreened populations. These comparisons can be confounded by differences in the populations, such as time, geography, health behaviors, and hormone usage. The calculations of overdiagnosis can vary in their adjustment for lead-time bias.[121,122] An overview of 29 studies found calculated rates of overdiagnosis to be 0%–54%, with rates from randomized studies between 11% and 22%.[123] In Denmark, where screened and unscreened populations existed concurrently, the rate of overdiagnosis of invasive cancer was calculated to be 14% and 39%, using two different methodologies. If DCIS cases were included, the overdiagnosis rates were 24% and 48%. The second methodology accounts for regional differences in women younger than the screening age and is likely more accurate.[124]
Theoretically, in a given population, the detection of more breast cancers at an early stage would result in a subsequent reduction in the incidence of advanced-stage cancers. This has not occurred in any of the populations studied to date. Thus, the detection of more early-stage cancers likely represents overdiagnosis. A population-based study in the Netherlands showed that about one-half of all screen-detected breast cancers, including DCIS, would represent overdiagnosis, a finding consistent with other studies that showed substantial rates of overdiagnosis associated with screening.[125]
A cohort study in Norway compared the increase in cancer incidence in women who were eligible for screening with the cancer incidence in younger women who were not eligible for screening; eligibility was based on age and residence. Eligible women experienced a 60% increase in incidence of localized cancers (RR, 1.60; 95% CI, 1.42–1.79), while the incidence of advanced cancers remained similar in the two groups (RR, 1.08; 95% CI, 0.86–1.35).[126]
A population study that compared different counties in the United States showed that higher rates of screening mammography use were associated with higher rates of breast cancer diagnoses, yet there was no corresponding decrease in 10-year breast cancer mortality.[127] The strengths of this study include its very large size (16 million women) and the strength and consistency of correlation observed across counties. The limitations of this study include the self-reporting of mammograms, the use of a 2-year window to estimate screening prevalence, and the period of analysis (when menopausal hormone use was present).[127]
The extent of overdiagnosis has been estimated in the Canadian NBSS, a randomized clinical trial. At the end of the five screening rounds, 142 more invasive breast cancer cases were diagnosed in the mammography arm, compared with the control arm.[128] At 15 years, the excess number of cancer cases in the mammography arm versus the control arm was 106, representing an overdiagnosis rate of 22% for the 484 screen-detected invasive cancers.[128]
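The 22% figure follows from dividing the persistent excess of cases in the mammography arm by the number of screen-detected cases. The following is a minimal sketch of that excess incidence arithmetic, using only the counts given above; the function name is illustrative.

```python
def excess_incidence_overdiagnosis(excess_cases: int, screen_detected_cases: int) -> float:
    """Overdiagnosis as persistent excess cases divided by screen-detected cases."""
    return excess_cases / screen_detected_cases


# Canadian NBSS at 15 years: 106 excess invasive cancers in the mammography
# arm and 484 screen-detected invasive cancers.
rate = excess_incidence_overdiagnosis(106, 484)
print(f"estimated overdiagnosis: {rate:.1%} of screen-detected invasive cancers")
```

The result, 21.9%, rounds to the 22% reported. The drop in the excess from 142 cases at the end of screening to 106 cases at 15 years reflects cancers in the control arm that surfaced clinically during follow-up.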
As a consequence of screening mammography, greater numbers of breast cancers with indolent behavior are now identified, resulting in potential overtreatment. In a secondary analysis of a randomized trial of tamoxifen versus no systemic therapy in patients with early breast cancer, the authors utilized the 70-gene MammaPrint assay and identified 15% of patients at ultra-low risk, with 20-year disease-specific survival rates of 97% in the tamoxifen group and 94% in the control group. Thus, these patients would likely have extremely good outcomes with surgery alone. The frequency of such ultra-low risk cancers in the screened population is likely around 25%. Tools such as the 70-gene MammaPrint assay might be utilized in the future to identify these cancers, and thereby, reduce the risk of overtreatment. However, additional studies are needed to confirm these findings.[129]
In 2016, the Canadian NBSS, a randomized screening trial with 25-year follow-up, re-estimated overdiagnosis of breast cancer from mammography screening by age group and concluded that approximately 30% of invasive screen-detected cancers in women aged 40 to 49 years and up to 20% of those detected in women aged 50 to 59 years were overdiagnosed. When in situ cancers are included, the estimated risks of overdiagnosis are 40% in women aged 40 to 49 years and 30% in women aged 50 to 59 years. Overdiagnosis was calculated as the persistent excess incidence in the screened arm versus the control arm divided by the number of screen-detected cases (excess incidence method). Requirements for adequate estimation of overdiagnosis utilizing this method included the following:
These conditions were largely met in the CNBSS because population-based screening did not become available throughout Canada until a minimum of 2 years later and in most instances 5 to 10 years later (thereby, allowing for cessation of screening after the trial screening period and follow-up longer than most estimates of lead time), because contamination is documented to have been minimal, and because individual randomization resulted in 44 almost identically distributed demographic factors and risk factors between the two trial arms.
Since the conclusion of the trial screening period in 1988, differences in screening quality, intensity, invited age range, and biopsy thresholds have reduced the generalizability of these results. These factors, together with improved imaging technique and quality and a low threshold for biopsy, likely contribute to lower estimates of overdiagnosis of in situ cancer than of invasive cancer.[130]
Table 1 shows results from a 10-year period of screening 10,000 women, estimating the number of women with breast cancer or DCIS that would never become clinically important (overdiagnosis). There was likely no overdiagnosis in the Health Insurance Plan study, which used old-technology mammography and CBE. Overdiagnosis has become more prominent in the era of improved-technology mammography. The improved technology has not, however, been shown to reduce mortality further than the original technology did. In summary, breast cancer overdiagnosis is a complex topic. Studies that used many different methods reported a wide range of estimates, and there is currently no way to assess whether new cancer cases are overdiagnosed or are of real harm to patients.[111]
Because fewer than 5 per 1,000 women screened have breast cancer, most abnormal mammograms are false-positives, even given the 90% specificity of mammography (i.e., 90% of all women without breast cancer will have a negative mammogram).[86]
This high false-positive rate of mammography is often underestimated and can seem counterintuitive because of a statistically based cognitive bias known as the base rate fallacy. Because the base rate of breast cancer is low (5 per 1,000), the number of false-positive results vastly exceeds the number of true-positive results, even when using a very accurate test.
Mammography’s true-positive rate of approximately 90% means that, of women with breast cancer, approximately 90% will test positive. The true-negative rate of 90% means that, of women without breast cancer, 90% will test negative. A 10% false-positive rate over 1,000 people means that there will be 100 false-positives in 1,000 people. If 5 in 1,000 women have breast cancer, then 4.5 women with breast cancer will have a positive test. In other words, there will be approximately 100 false-positives for every 4.5 true positives.
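The arithmetic behind the base rate effect can be checked directly. The sketch below is illustrative only: it takes the prevalence (5 per 1,000), sensitivity (90%), and specificity (90%) quoted above as inputs, and the function name is an assumption made for this example.

```python
def screening_outcomes(n_screened, prevalence, sensitivity, specificity):
    """Return expected true positives, false positives, and PPV."""
    with_cancer = n_screened * prevalence
    without_cancer = n_screened - with_cancer
    true_pos = with_cancer * sensitivity
    # False positives arise only among women without cancer.
    false_pos = without_cancer * (1.0 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Inputs quoted in the text: 1,000 women, 5 per 1,000 with cancer,
# 90% sensitivity, 90% specificity.
tp, fp, ppv = screening_outcomes(1000, 5 / 1000, 0.90, 0.90)
print(f"true positives:  {tp:.1f}")
print(f"false positives: {fp:.1f}")
print(f"PPV:             {ppv:.1%}")
```

Applying the 10% false-positive rate to the 995 women without cancer gives 99.5 false-positives, which the text rounds to 100. Under these inputs, only about 4% of positive mammograms correspond to a cancer.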
Further, abnormal results from screening mammograms prompt additional tests and procedures, such as mammographic views of the region of concern, ultrasound, MRI, and tissue sampling (by fine-needle aspiration, core biopsy, or excisional biopsy). Overall, the harm from unnecessary tests and treatments must be weighed against the benefit of early detection.
A study of breast cancer screening in 2,400 women enrolled in a health maintenance organization found that over a decade, 88 cancers were diagnosed, 58 of which were identified by mammography. One-third of the women had an abnormal mammogram result that required additional testing: 539 additional mammograms, 186 ultrasound examinations, and 188 biopsies. The cumulative biopsy rate (the rate of true positives) resulting from mammographic findings was approximately 1 in 4 (23.6%). The PPV of an abnormal screening mammogram in this population was 6.3% for women aged 40 to 49 years, 6.6% for women aged 50 to 59 years, and 7.8% for women aged 60 to 69 years.[131] A subsequent analysis and modeling of data from the same cohort of women estimated that the risk of having at least one false-positive mammogram was 7.4% (95% CI, 6.4%–8.5%) at the first mammogram, 26.0% (95% CI, 24.0%–28.2%) by the fifth mammogram, and 43.1% (95% CI, 36.6%–53.6%) by the ninth mammogram.[132] Cumulative risk of at least one false-positive result depended on four patient variables (younger age, higher number of previous breast biopsies, family history of breast cancer, and current estrogen use) and three radiologic variables (longer time between screenings, failure to compare the current and previous mammograms, and the individual radiologist’s tendency to interpret mammograms as abnormal). Overall, the factor most responsible for a false-positive mammogram was the individual radiologist’s tendency to read mammograms as abnormal.
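To see how a modest per-screen risk accumulates over repeated screening, the sketch below applies a deliberately naive model: every screen is assumed to carry the same 7.4% false-positive risk, independently of earlier screens. This is an assumption made for illustration only and is not the model used in the cited analysis, which accounted for patient and radiologic variables.

```python
def cumulative_false_positive_risk(per_screen_risk: float, n_screens: int) -> float:
    """Probability of at least one false positive in n independent screens."""
    return 1.0 - (1.0 - per_screen_risk) ** n_screens


# Assumption: constant, independent per-screen risk equal to the
# first-mammogram estimate quoted in the text (7.4%).
first_screen_risk = 0.074

# Modeled estimates reported in the text, for comparison.
reported = {1: 0.074, 5: 0.260, 9: 0.431}

naive = {n: cumulative_false_positive_risk(first_screen_risk, n) for n in reported}
for n in sorted(reported):
    print(f"screen {n}: naive {naive[n]:.1%}, reported {reported[n]:.1%}")
```

The naive values come out higher than the modeled estimates at the fifth and ninth mammograms, so the simple formula should be read as a rough upper-level intuition rather than a substitute for the study's results.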
A prospective cohort study of community-based screening found that a greater proportion of women undergoing annual screening had at least one false-positive screen after 10 years than did women undergoing biennial screening, regardless of breast density. For women with scattered fibroglandular densities, the difference was 68.9% (annual) versus 46.3% (biennial) for women in their 40s. For women aged 50 to 74 years, the difference for this density group was 49.8% (annual) versus 30.7% (biennial).[107]
As shown in Table 1, the estimated number of women out of 10,000 who underwent annual screening mammography during a 10-year period with at least one false-positive test result is 6,130 for women aged 40 to 50 years and 4,970 for women aged 60 years. The number of women with a false-positive test that results in a biopsy is estimated to range from 700 to 980, depending on age.[111]
A longitudinal Norwegian study correlated benign abnormal screening results with long-term breast cancer outcomes. Women with any abnormal screening examination had an increased risk of subsequent breast cancer, despite a negative evaluation (see Table 2). The features of the subsequent breast cancer were more favorable for the women who had prior screening abnormalities, possibly because the preexisting breast abnormality was a marker for slow-growing premalignant disease.[133]
Screening Result | Absolute Risk per 1,000 Women-Years | Relative Risk vs. Women Who Screened Negative |
---|---|---|
Benign with additional imaging | 4.4 | 1.8 |
Negative biopsy | 4.7 | 2.0 |
Atypia | 6.9 | 2.9 |
In situ cancer | 9.5 | 3.8 |
The sensitivity of mammography ranges from 70% to 90%, depending on characteristics of the interpreting radiologist (level of experience) and characteristics of the woman (age, breast density, hormone status, and diet). Assuming an average sensitivity of 80%, mammograms will miss approximately 20% of the breast cancers that are present at the time of screening (false-negatives). Many of these missed cancers are high risk, with adverse biological characteristics. If a normal mammogram leads a woman or her doctor to forgo or postpone evaluation of breast symptoms, she may suffer adverse consequences. Thus, a negative mammogram should never dissuade a woman or her physician from additional evaluation of breast symptoms.
Positioning of the woman and breast compression reduce motion artifact and improve mammogram image quality. Pain and/or discomfort was reported by 90% of women undergoing mammography, with 12% of women rating the sensation as intense or intolerable.[134] A systematic review of 22 studies investigating mammography-associated pain and discomfort found wide variations, some of which were associated with menstrual cycle stage, anxiety, and premammography anticipation of pain.[135]
The major risk factors for radiation-associated breast cancer are young age at exposure and dose; however, in rare cases, women with an inherited susceptibility to radiation-induced damage must avoid radiation exposure at any age.[136,137] For many women older than 40 years, the likely benefits of screening mammography outweigh the risks.[138,136,139] Standard two-view screening mammography exposes the breasts to a mean dose of 4 mSv, and the whole body to 0.29 mSv.[137,140] Thus, up to one breast cancer may be induced per 1,000 women undergoing annual mammograms from ages 40 to 80 years. Such risk is doubled in women with large breasts who require increased radiation doses and in women with breast augmentation who require additional views. Radiation-induced breast cancers may be reduced fivefold for women who begin biennial screening at age 50 years rather than annually at age 40 years.[141]
A telephone survey of 308 women performed 3 months after screening mammography revealed that about one-fourth of the 68 women recalled for additional testing were still experiencing worry that affected their mood or functioning, even though that testing had ruled out cancer.[142] Research into whether the psychological impact of a false-positive test is long-standing yields mixed results. A cohort study in Spain in 2002 found an immediate psychological impact on women after they received a false-positive mammogram, but these effects dissipated within a few months.[143] A cohort study in Denmark in 2013 that measured the psychological effects of a false-positive test result several years after the event found long-term negative psychological consequences.[144] Several studies have shown that the anxiety after evaluation of a false-positive test leads to increased participation in future screening examinations.[145-148]
These potential harms of screening have not been well researched, but it is clear that they exist.
Ultrasound
Ultrasound is used for the diagnostic evaluation of palpable or mammographically identified masses, rather than serving as a primary screening modality. A review of the literature and expert opinion by the European Group for Breast Cancer Screening concluded that “there is little evidence to support the use of ultrasound in population breast cancer screening at any age.”[1] The Japan Strategic Anti-cancer Randomized Trial (J-START) is a screening trial that randomly assigned women aged 40 to 49 years to either mammography and ultrasound screening (intervention group) or mammography screening alone (control group). The initial results of this trial indicated that supplemental screening with ultrasound (i.e., mammography + ultrasound versus mammography alone) increased the detection rate of early-stage breast cancers, but its effect on mortality is not clear at this time.[2]
Breast MRI
Breast MRI is used in women for diagnostic evaluation, including evaluating the integrity of silicone breast implants, assessing palpable masses after surgery or radiation therapy, detecting mammographically and sonographically occult breast cancer in patients with axillary nodal metastasis, and preoperative planning for some patients with known breast cancer. There is no ionizing radiation exposure with this procedure. MRI has been promoted as a screening test for breast cancer among women at elevated risk of breast cancer based on BRCA1/2 mutation carrier status, a strong family history of breast cancer, or several genetic syndromes, such as Li-Fraumeni syndrome or Cowden disease.[3-5] Breast MRI is more sensitive but less specific than screening mammography [6,7] and is up to 35 times as expensive.[8-12]
Thermography
Using infrared imaging techniques, thermography of the breast identifies temperature changes in the skin as a possible indicator of an underlying tumor, displaying these changes in color patterns. Thermographic devices have been approved by the U.S. Food and Drug Administration under the 510(k) process, but no randomized trials have compared thermography to other screening modalities. Small cohort studies do not suggest any additional benefit for the use of thermography as an adjunct modality.[13,14]
The effect of screening clinical breast examination (CBE) on breast cancer mortality has not been fully established. The Canadian National Breast Screening Study (CNBSS) compared high-quality CBE plus mammography with CBE alone in women aged 50 to 59 years. CBE, lasting 5 to 10 minutes per breast, was conducted by trained health professionals, with periodic evaluations of performance quality. The frequency of cancer diagnosis, stage, interval cancers, and breast cancer mortality were similar in the two groups and similar to outcomes with mammography alone.[1] With a mean follow-up of 13 years, breast cancer mortality was similar in the two groups (mortality rate ratio, 1.02; 95% confidence interval [CI], 0.78–1.33).[2] The investigators estimated the operating characteristics for CBE alone; for 19,965 women aged 50 to 59 years, sensitivity was 83%, 71%, 57%, 83%, and 77% for years 1, 2, 3, 4, and 5 of the trial, respectively; specificity ranged between 88% and 96%. Positive predictive value (PPV), which is the proportion of cancers detected per abnormal examination, was estimated to be 3% to 4%. For 25,620 women aged 40 to 49 years who were examined only at entry, the estimated sensitivity was 71%, specificity was 84%, and PPV was 1.5%.[3]
In clinical trials involving community clinicians, CBE-type screening had higher specificity (97%–99%) [4] and lower sensitivity (22%–36%) than that experienced by examiners.[5-8] A study of screening in women with a positive family history of breast cancer showed that, after a normal initial evaluation, the patient herself, or her clinician performing a CBE, identified more cancers than did mammography.[9]
Another study examined the usefulness of adding CBE to screening mammography; among 61,688 women older than 40 years and screened by mammography and CBE, sensitivity for mammography was 78%, and combined mammography-CBE sensitivity was 82%. Specificity was lower for women undergoing both screening modalities than it was for women undergoing mammography alone (97% vs. 99%).[10] Another study reported the results of a large cluster randomized controlled trial in India that assessed the efficacy of screening with CBE versus no screening on breast cancer mortality.[11] This trial recruited 151,538 women aged 35 to 64 years with no history of breast cancer. After 20 years of follow-up, there was an overall statistically nonsignificant 15% reduction in breast cancer mortality in the screening with CBE arm versus the control arm, but a post hoc subset analysis demonstrated a statistically significant 30% relative reduction in mortality attributable to screening with CBE for women older than 50 years. However, the results of the subset analysis should be interpreted with caution, as this was a cluster randomized trial with only 20 clusters, which raises concerns about potential imbalances between the control and study arms of the trial. Other international trials of CBE are under way, one in India and one in Egypt.
Monthly BSE has been promoted, but there is no evidence that it reduces breast cancer mortality.[12,13] The only large, randomized clinical trial of BSE assigned 266,064 female Shanghai factory workers to either BSE instruction with reinforcement and encouragement, or instruction on the prevention of lower back pain. Neither group underwent any other breast cancer screening. After 10 to 11 years of follow-up, 135 breast cancer deaths occurred in the instruction group, and 131 cancer deaths occurred in the control group (relative risk [RR], 1.04; 95% CI, 0.82–1.33). Although the number of invasive breast cancers diagnosed in the two groups was about the same, women in the instruction group had more breast biopsies and more benign lesions diagnosed than did women in the control group.[14]
Other research results on BSE come from three studies. First, more than 100,000 Leningrad women were assigned to BSE training or control by cluster randomization; the BSE training group had more breast biopsies without improved breast cancer mortality.[15] Second, in the United Kingdom Trial of Early Detection of Breast Cancer, more than 63,500 women aged 45 to 64 years were invited to educational sessions about BSE. After 10 years of follow-up, breast cancer mortality rates were similar to the rates in centers without organized BSE education (RR, 1.07; 95% CI, 0.93–1.22).[16] Third, in contrast, a case-control study nested within the CNBSS compared self-reported BSE frequency before enrollment with breast cancer mortality. Women who examined their breasts visually, used their finger pads for palpation, and used their three middle fingers had a lower breast cancer mortality rate.[17]
Various methods to analyze breast tissue for malignancy have been proposed to screen for breast cancer, but none have been associated with mortality reduction.
Health Insurance Plan, United States 1963 [1,2]
Malmo, Sweden 1976 [3,4]
Östergötland (County E of Two-County Trial), Sweden 1977 [6-8]
Kopparberg (County W of Two-County Trial), Sweden 1977 [6-8]
Edinburgh, United Kingdom 1976 [10]
The study design and conduct make these results difficult to assess or combine with the results of other trials.
National Breast Screening Study (NBSS)-1, Canada 1980 [11]
NBSS-2, Canada 1980 [15]
Stockholm, Sweden 1981 [16]
Gothenburg, Sweden 1982
AGE Trial [17,18]
The United Kingdom AGE Trial, a large RCT, compared breast cancer mortality in women aged 40 years and older who were invited for annual mammography with that in women covered only by the National Health Service (NHS) screening programs that began at age 50 years. The primary end point of the AGE Trial was mortality from breast cancer diagnosed during the intervention period until immediately before participants’ first NHS screening. This trial remains the only trial designed specifically to study the effect of mammographic screening starting at age 40 years and is one of three RCTs that the Cochrane group’s 2013 meta-analysis deemed adequately randomized.
In 2006, the AGE Trial published results of breast cancer mortality at a mean follow-up of 10.7 years: a reduction in breast cancer mortality in the intervention group, which did not reach statistical significance (105 breast cancer deaths in the intervention group vs. 251 breast cancer deaths in the control group).
In 2015, the AGE Trial published results of breast cancer mortality at a median follow-up of 17.7 years: no statistically significant reduction after more than 10 years of follow-up and no statistically significant decrease in all-cause mortality. At this time, it also published results of a reanalysis of the original data set: a small, transient, statistically significant reduction in breast cancer mortality in the intervention group during the first 10 years after randomization (83 breast cancer deaths in the intervention group vs. 219 breast cancer deaths in the control group).
In 2020, the AGE Trial published final results based on median follow-up of 22.9 years including:
This evidence is inadequate to support the conclusion of a clinically significant breast cancer mortality reduction attributable to initiation of screening mammography among women aged 39 to 49 years. The reported mortality reduction is a small, transient reduction in breast cancer mortality based on post hoc, subset analysis, nonstandard imaging protocol, and nonstandard threshold for biopsy (microcalcifications were not biopsied). In absolute terms, the difference in breast cancer mortality was -0.6 deaths per 1,667 women in the 40 to 49 years age group based on a reanalysis of the original data set, which was not statistically significant, and the recalculation of breast cancer mortality in a subgroup restricted to 10 years of follow-up. At a median follow-up of 22.9 years, there was no statistically significant decrease in risk of breast cancer or all-cause mortality.[18]
This evidence is inadequate to make a clear determination of the magnitude of overdiagnosis. Because the evidence is based on subgroup analysis, a nonstandard imaging schedule, a nonstandard imaging protocol, and a nonstandard threshold for biopsy (microcalcifications were not biopsied) with uncertain relevance to the general population, it does not support the investigators’ conclusion of “at worst a small amount of overdiagnosis.”[18]
The PDQ cancer information summaries are reviewed regularly and updated as new information becomes available. This section describes the latest changes made to this summary as of the date above.
Updated statistics with estimated new cases and deaths for 2024 (cited American Cancer Society as reference 1).
Added text about a prospective multicenter study, known as the Dense Breast Tomosynthesis Ultrasound Screening Trial or DBTUST, that investigated whether ultrasound improved cancer detection after digital breast tomosynthesis in women with dense breasts (cited Berg et al. as reference 69). The study concluded that technologist-performed ultrasound screening modestly improved detection of cancer and also increased the false-positive recall rate in women with dense breasts.
This summary is written and maintained by the PDQ Screening and Prevention Editorial Board, which is editorially independent of NCI. The summary reflects an independent review of the literature and does not represent a policy statement of NCI or NIH. More information about summary policies and the role of the PDQ Editorial Boards in maintaining the PDQ summaries can be found on the About This PDQ Summary and PDQ® Cancer Information for Health Professionals pages.
This PDQ cancer information summary for health professionals provides comprehensive, peer-reviewed, evidence-based information about breast cancer screening. It is intended as a resource to inform and assist clinicians in the care of their patients. It does not provide formal guidelines or recommendations for making health care decisions.
This summary is reviewed regularly and updated as necessary by the PDQ Screening and Prevention Editorial Board, which is editorially independent of the National Cancer Institute (NCI). The summary reflects an independent review of the literature and does not represent a policy statement of NCI or the National Institutes of Health (NIH).
Board members review recently published articles each month to determine whether an article should:
Changes to the summaries are made through a consensus process in which Board members evaluate the strength of the evidence in the published articles and determine how the article should be included in the summary.
Any comments or questions about the summary content should be submitted to Cancer.gov through the NCI website's Email Us. Do not contact the individual Board Members with questions or comments about the summaries. Board members will not respond to individual inquiries.
Some of the reference citations in this summary are accompanied by a level-of-evidence designation. These designations are intended to help readers assess the strength of the evidence supporting the use of specific interventions or approaches. The PDQ Screening and Prevention Editorial Board uses a formal evidence ranking system in developing its level-of-evidence designations.
PDQ is a registered trademark. Although the content of PDQ documents can be used freely as text, it cannot be identified as an NCI PDQ cancer information summary unless it is presented in its entirety and is regularly updated. However, an author would be permitted to write a sentence such as “NCI’s PDQ cancer information summary about breast cancer prevention states the risks succinctly: [include excerpt from the summary].”
The preferred citation for this PDQ summary is:
PDQ® Screening and Prevention Editorial Board. PDQ Breast Cancer Screening. Bethesda, MD: National Cancer Institute. Updated . Available at: https://www.cancer.gov/types/breast/hp/breast-screening-pdq. Accessed . [PMID: 26389344]
Images in this summary are used with permission of the author(s), artist, and/or publisher for use within the PDQ summaries only. Permission to use images outside the context of PDQ information must be obtained from the owner(s) and cannot be granted by the National Cancer Institute. Information about using the illustrations in this summary, along with many other cancer-related images, is available in Visuals Online, a collection of over 2,000 scientific images.
The information in these summaries should not be used as a basis for insurance reimbursement determinations. More information on insurance coverage is available on Cancer.gov on the Managing Cancer Care page.
If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “Breast Cancer Screening (PDQ®)–Health Professional Version was originally published by the National Cancer Institute.”