Abstract:
To help select suitable population-level variant calling tools for different datasets, four software tools (SAMtools, GATK, FreeBayes and Sambamba) were compared. Each tool was used to call variants from resequencing datasets covering three species with different genome sizes (Arabidopsis, rice and human) as well as tobacco linkage group 1. Comparisons on single-sample and multiple-sample data showed that SAMtools and Sambamba tended to call as many variants as possible, whereas the variants reported by GATK and FreeBayes tended to be more accurate. Sambamba was much faster than the other tools, and GATK had some speed advantage in multiple-sample analysis. GATK also consumed much more memory than the other tools.