NCGC participates in ICGC mutation calling benchmark study
Published in Nature Communications
Algorithms for detecting somatic mutations in cancer from next-generation sequencing data come in numerous variations, and the choice has a profound effect on which mutations are detected and which artefacts appear. The International Cancer Genome Consortium (ICGC), including the NCGC Bioinformatics team, has benchmarked the procedure and determined an optimized consensus pipeline, which is now used for the NCGC studies. This is one of the central objectives of the NCGC platform: to establish common, reliable procedures for somatic mutation scoring in patient samples, and to lay the groundwork for standardized nationwide diagnostic procedures in the Norwegian health service.
The study grew out of a comparison in which ICGC had 14 of the best bioinformatics teams identify mutations in the same whole-genome tumour/normal data set, each using its own best algorithms.
The figure shows the fractions of point mutation calls that were shared by various numbers of groups. Calls for larger mutations (indels) varied even more.
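As an illustration of how such a concordance breakdown can be computed, the following is a minimal sketch and not the method used in the study. The function name, team names and toy mutation calls are all hypothetical, and each call is assumed to be identified by chromosome, position, reference base and alternate base.

```python
from collections import Counter


def concordance_fractions(call_sets):
    """Return {k: fraction of distinct calls reported by exactly k groups}.

    call_sets maps a group name to its set of mutation calls
    (hypothetical representation: (chrom, pos, ref, alt) tuples).
    """
    support = Counter()
    for calls in call_sets.values():
        # set() ensures each group counts at most once per mutation
        support.update(set(calls))
    total = len(support)
    if total == 0:
        return {}
    groups_per_call = Counter(support.values())
    return {k: groups_per_call[k] / total for k in range(1, len(call_sets) + 1)}


# Hypothetical toy data, not taken from the benchmark
calls = {
    "team_A": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "T")},
    "team_B": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")},
    "team_C": {("chr1", 100, "A", "T"), ("chr3", 75, "T", "G")},
}
fractions = concordance_fractions(calls)
for k, fraction in fractions.items():
    print(f"called by exactly {k} group(s): {fraction:.2f}")
```

In this toy example, half of the distinct calls are made by a single team only, which is the kind of disagreement the benchmark set out to quantify.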
These data sets, and those in the published study, are much larger than those produced by NCGC, where only the genes (the exome, 60 Mbp) are sequenced. However, the principles of mutation calling are essentially the same. Although coverage in whole-genome data is regarded as more even than in exome data, the study determined that both blood and tumour should be sequenced to a coverage of 100x to score most mutations, and that even deeper sequencing might be necessary to also detect mutations in smaller subclones of cancer cells.