Abstract
Wikipedia is a widely used online reference work which cites hundreds of thousands of scientific articles across its entries. The quality of these citations has not been previously measured, and such measurements have a bearing on the reliability and quality of the scientific portions of this reference work. Using a novel technique, a massive database of qualitatively described citations, and machine learning algorithms, we analyzed 1,923,575 Wikipedia articles which cited a total of 841,821 scientific articles, and found that most cited articles (58%) are uncited or untested by subsequent studies, while the remainder show a wide variability in contradicting or supporting evidence (2-40%).
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.