PCR duplicates info

Evaluating the necessity of PCR duplicate removal from next-generation sequencing
data and a comparison of approaches

Authors: Mark Ebbert*, Mark Wadsworth*, Lyndsay Staley*, Kaitlyn Hoyt, Brandon Pickett, Justin Miller, John Duce, for the Alzheimer's Disease Neuroimaging Initiative, John SK Kauwe, Perry Ridge

BMC Bioinformatics
Article
20 Citations
July 2016
Above Average Altmetric Score of 3
Tweeted by 5 people
130 Mendeley Readers
Novelty of Approach

-Evaluated PCR duplicate removal on final genome assembly
-Compared ChIP-seq data with whole genome sequencing (WGS) data
-Performed depth of coverage analysis on WGS data
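
To make the two ideas above concrete, here is a minimal sketch of what duplicate removal does and how it changes depth of coverage. It is not the Picard or SAMTools implementation. The `Read` record, the `remove_duplicates` and `mean_depth` helpers, and the toy reads are all hypothetical, and the sketch assumes a simplified rule: reads sharing chromosome, leftmost start, and strand count as duplicates, and the one with the highest summed base quality is kept.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """A minimal aligned read; every field here is illustrative."""
    name: str
    chrom: str
    start: int      # 0-based leftmost aligned position
    length: int     # aligned length in bases
    strand: str     # "+" or "-"
    qual_sum: int   # sum of base qualities, used to pick a representative


def remove_duplicates(reads):
    """Keep one read per (chrom, start, strand) group: the highest-quality one."""
    best = {}
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in best or read.qual_sum > best[key].qual_sum:
            best[key] = read
    return list(best.values())


def mean_depth(reads, chrom, region_start, region_end):
    """Mean per-base depth over the half-open interval [region_start, region_end)."""
    covered = 0
    for read in reads:
        if read.chrom != chrom:
            continue
        overlap_start = max(read.start, region_start)
        overlap_end = min(read.start + read.length, region_end)
        covered += max(0, overlap_end - overlap_start)
    return covered / (region_end - region_start)


# Toy reads (made up for illustration, not data from the paper).
reads = [
    Read("r1", "chr1", 100, 50, "+", 1500),
    Read("r2", "chr1", 100, 50, "+", 1800),  # same position and strand as r1
    Read("r3", "chr1", 100, 50, "-", 1700),  # opposite strand, so not a duplicate
    Read("r4", "chr1", 120, 50, "+", 1600),
]
deduped = remove_duplicates(reads)
print(sorted(r.name for r in deduped))        # ['r2', 'r3', 'r4']
print(mean_depth(reads, "chr1", 100, 200))    # 2.0
print(mean_depth(deduped, "chr1", 100, 200))  # 1.5
```

Real tools work on aligned BAM records and handle details this sketch ignores, such as paired-end reads and soft clipping.
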
Results

-About 92% of the 17+ million variants were called regardless of whether duplicates were removed with Picard, removed with SAMTools, or left in the dataset.
-No significant differences were found between the variant sets unique to each approach
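
The comparison behind these results can be sketched as a set operation over the variant calls from each treatment. This is only an illustration of the idea, not the authors' analysis code. The `concordance` helper, the treatment labels, and the toy variants are assumptions, and each variant is represented as a hypothetical `(chrom, pos, ref, alt)` tuple.

```python
def concordance(call_sets):
    """Compare variant call sets keyed by treatment name.

    Returns the variants shared by every set, the fraction of the union
    that is shared, and the variants unique to each set.
    """
    names = list(call_sets)
    union = set().union(*call_sets.values())
    shared = set.intersection(*(set(v) for v in call_sets.values()))
    unique = {}
    for name in names:
        others = set().union(*(call_sets[o] for o in names if o != name))
        unique[name] = set(call_sets[name]) - others
    fraction = len(shared) / len(union) if union else 0.0
    return shared, fraction, unique


# Toy variants (made up for illustration, not data from the paper).
v1 = ("chr1", 1000, "A", "G")
v2 = ("chr1", 2000, "C", "T")
v3 = ("chr2", 1500, "G", "A")
v4 = ("chr2", 3000, "T", "C")
v5 = ("chr3", 500, "A", "C")

call_sets = {
    "no_removal": {v1, v2, v3, v4},
    "picard": {v1, v2, v3, v5},
    "samtools": {v1, v2, v3},
}
shared, fraction, unique = concordance(call_sets)
print(len(shared), fraction)  # 3 0.6
print(unique["no_removal"])   # {('chr2', 3000, 'T', 'C')}
```

The per-treatment `unique` sets correspond to the "unique variant sets" that the summary reports as not significantly different.
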
Implications

-Removing PCR duplicates appears unnecessary for variant calling in data like these
-Skipping PCR duplicate removal saves compute and analysis time