Guides

Filtering SNP alignments with SKA

SKA is a tool for comparing small and highly similar genomes using split k-mers. This guide explains how to create a SNP alignment from a skf file using the different options implemented in the command ska align (here using ska v0.3.5).

Recommended command line

For those in a hurry, the recommended command line for filtering for precise SNP calling is:

ska align --no-gap-only-sites --filter-ambig-as-missing --filter no-ambig-or-const

Breakdown using an example

In this example, let’s consider a skf file generated using k=51 from Illumina sequencing reads of 45 Mycobacterium tuberculosis isolates collected by the UKHSA, and belonging to the same transmission cluster (i.

Guides

Building trees with SKA

SKA is a tool for comparing small and highly similar genomes using split k-mers. This guide will explain how to use SKA to build a phylogenetic tree for different Escherichia coli lineages in a few minutes. Although SKA is tailored more towards analysing variation within a lineage, tree-building ends up working fine for the whole species but requires more memory.

Why SKA is good for building phylogenetic trees

The basic approach to building a tree with SKA is to generate a SNP alignment using split k-mers and then feed that to a tree building algorithm of choice.
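To make the split k-mer idea concrete, here is a toy Python sketch. It is not SKA's implementation: the function names `split_kmers` and `snp_alignment` are hypothetical, and the sketch assumes assembled sequences on a single strand (no reverse complements, no read-count filtering). Each k-mer of odd length is split into its two flanks and its middle base. Samples that share a flank pair but differ in the middle base contribute a variable column to the alignment.

```python
def split_kmers(seq, k=7):
    """Map each flank pair (left + right halves of a k-mer) to its middle base."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that there is a single middle base")
    half = k // 2
    table = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        flanks = kmer[:half] + kmer[half + 1:]
        middle = kmer[half]
        # The same flanks seen with two different middle bases are ambiguous.
        if flanks in table and table[flanks] != middle:
            table[flanks] = "N"
        else:
            table[flanks] = middle
    return table


def snp_alignment(samples, k=7):
    """Build a toy SNP alignment from a dict of {sample name: sequence}."""
    tables = {name: split_kmers(seq, k) for name, seq in samples.items()}
    all_flanks = sorted(set().union(*tables.values()))
    columns = []
    for flanks in all_flanks:
        # Samples missing this flank pair get a gap character.
        column = [tables[name].get(flanks, "-") for name in samples]
        called = {base for base in column if base in "ACGT"}
        if len(called) > 1:  # keep only columns with at least two different bases
            columns.append(column)
    return {
        name: "".join(column[i] for column in columns)
        for i, name in enumerate(samples)
    }


# Two toy sequences that differ by a single C/T substitution.
toy = {"a": "ACGTACGGTCA", "b": "ACGTATGGTCA"}
print(snp_alignment(toy, k=7))  # {'a': 'C', 'b': 'T'}
```

The resulting dictionary could be written out as FASTA and passed to a tree builder. The flank-to-middle-base lookup needs no reference genome, only shared flanks between samples.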

October 18, 2022

A beginner's guide to fitting PopPUNK models

PopPUNK now has a lot of different models available, which can make it hard to know where to start, or to tell if you’ve done the right thing when fitting one to your data.

Some questions I’ll address:
Do you need to fit a new model?
Should I subsample my data?
Which model should I use?
Are my clusters correct?
My clusters don’t match MLST/CC

tl;dr

Don’t fit a model if you don’t have to.