We present RNA-Seq-Pop, a reproducible and scalable Snakemake workflow for end-to-end analysis of RNA sequencing data sets. The workflow allows the user to perform quality control and differential expression analyses and to call genomic variants. Additional options include the calculation of allele frequencies of variants of interest, summaries of genetic variation and population structure, and genome-wide selection scans, together with clear visualizations.
RNA-Seq-Pop is applicable to any organism, and we demonstrate the utility of the workflow by investigating
pyrethroid resistance in selected strains of the major
malaria mosquito, Anopheles gambiae. The workflow provides additional modules specifically for An. gambiae, including estimating recent ancestry and determining the karyotype of common
chromosomal inversions. The Busia laboratory colony used for selections was collected in Busia, Uganda, in November 2018. We performed a comparative analysis of three groups: a parental G24 Busia strain; its
deltamethrin-selected G28 offspring; and the susceptible reference strain Kisumu. Measures of genetic diversity reveal patterns consistent with laboratory colonization and selection, with the parental Busia strain exhibiting the highest
nucleotide diversity, followed by the selected Busia offspring, and finally, Kisumu. Differential expression and variant analyses reveal that the selected Busia colony exhibits a number of distinct mechanisms of
pyrethroid resistance, including the Vgsc-995S target-site mutation, upregulation of SAP genes, P450s and a cluster of
carboxylesterases. During
deltamethrin selections, the 2La
chromosomal inversion rose in frequency (from 33% to 86%), supporting a previously reported link with
pyrethroid resistance.
RNA-Seq-Pop is hosted at github.com/sanjaynagi/rna-seq-pop. We anticipate that the workflow will provide a useful tool to facilitate reproducible transcriptomic studies in An. gambiae and other taxa.