Diagnostic accuracy statistics, including predictive values, risk differences, Youden's index, and the area under the curve (AUC), assess the promise of novel biomarkers proposed as diagnostic tests. We reinterpret these statistics in light of risk stratification (how well a biomarker separates those at higher risk from those at lower risk) to better understand their implications for public-health programs. We introduce an intuitively simple statistic, Mean Risk Stratification (MRS): the average change in risk (pre-test vs. post-test) revealed for tested individuals. A higher MRS implies better risk separation achieved by testing. MRS demonstrates that conventional predictive values can mislead because they do not account for disease prevalence or test-positivity rates. Little risk stratification is possible for rare diseases, which sets a "high bar" for justifying population-based screening. Importantly, we demonstrate that the risk difference, Youden's index, and AUC measure only multiplicative relative gains in risk stratification: AUC=0.6 achieves only 20% of maximum risk stratification (AUC=0.9 achieves 80%). However, large relative gains in risk stratification might not imply large absolute gains if disease is rare or if the test is rarely positive. We illustrate MRS with our experience comparing the performance of cervical cancer screening tests in China vs. the USA. The test with the worst AUC=0.72 in China (visual inspection with acetic acid) provides twice the risk stratification of the test with the best AUC=0.83 in the USA (human papillomavirus and Pap cotesting) because China has three times more cervical precancer/cancer. MRS could be routinely calculated to better understand the clinical/public-health implications of standard diagnostic accuracy statistics.