Are There 200,000 "Duplicate" Articles in Journals Indexed by Medline?
A recent study published in Nature suggests that there may be as many as 200,000 duplicate articles (articles that were either published in multiple journals or plagiarized) in journals indexed by Medline. To conduct the study, Mounir Errami and Harold Garner used the eTBLAST software to analyze samples of Medline article abstracts and estimate the prevalence of duplicate articles.
Duplicate detection is of great concern to both publishers and scholars. The CrossCheck project is allowing eight publishers to test duplicate checking as part of the editorial process in a closed-access environment. The project's home page states:
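To give a rough sense of how a sample-based estimate of this kind can work, here is a minimal sketch. It is not eTBLAST's algorithm, which is not described here; it swaps in a simple Jaccard similarity over three-word shingles as a stand-in for abstract comparison. The function names, the shingle size, and the numbers in the usage note are all hypothetical.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text shorter than k words becomes a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two abstracts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def estimate_duplicates(sample_size, flagged_in_sample, corpus_size):
    """Scale the flagged fraction of a random sample up to the whole corpus."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return round(corpus_size * flagged_in_sample / sample_size)
```

With made-up numbers, estimate_duplicates(1000, 10, 50000) scales a 1% hit rate in the sample to 500 suspected duplicates in a corpus of 50,000. Pairs that score high on a measure like jaccard would still need human review before being counted as duplicates.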
Currently, existing PD [plagiarism detection] systems do not index the majority of scholarly/professional content because it is inaccessible to crawlers directed at the open web. The only scholarly literature that is currently indexed by PD systems is that which is available openly (e.g. OA, Archived or illegitimately posted copies) or that which has been made available via third-party aggregators (e.g. ProQuest). This, in turn, means that any publisher who is interested in employing PD systems in their editorial work-flow is unable to do so effectively. Even if a particular publisher doesn't have a problem with plagiarized manuscripts, they should have an interest in making sure that their own published content is not plagiarized or otherwise illegitimately copied.
In order for CrossRef members to use existing PD systems, there needs to be a mechanism through which PD system vendors can, under acceptable terms & conditions, create and use databases of relevant scholarly and professional content.
Open access advocates have pointed out that one advantage of OA is that it allows the unrestricted analysis and manipulation of the full text of freely available works. Open access makes it possible for all interested parties, including scholars and others who might not have access to closed duplicate verification databases, to conduct whatever analysis they wish and to make the results public without having to consider potential business impacts.
Read more about it at: "Copycat Articles Seem Rife in Science Journals, a Digital Sleuth Finds" and "How Many Papers Are Just Duplicates?"