De-novo assembly of Xanthomonas genomes from Illumina NovaSeq reads
David J Studholme, Jamie Harrison
Disclaimer
DISCLAIMER – FOR INFORMATIONAL PURPOSES ONLY; USE AT YOUR OWN RISK
The protocol content here is for informational purposes only and does not constitute legal, medical, clinical, or safety advice, or otherwise; content added to protocols.io is not peer reviewed and may not have undergone a formal approval of any kind. Information presented in this protocol should not substitute for independent professional judgment, advice, diagnosis, or treatment. Any action you take or refrain from taking using or relying upon the information presented here is strictly at your own risk. You agree that neither the Company nor any of the authors, contributors, administrators, or anyone else associated with protocols.io, can be held responsible for your use of the information contained in or linked to this protocol or any of our Sites/Apps and Services.
Abstract
This protocol describes the de-novo assembly of Xanthomonas genome sequences from short-read genomic shotgun sequencing data. It covers quality control of the raw sequence reads (fastp), assembly (SPAdes), and finally polishing of the assembly (Pilon) based on alignment of the reads against the preliminary assembly.
Steps
Software prerequisites: fastp, SPAdes, Bowtie 2, SAMtools and Pilon must be installed and available on the command line.
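Before starting, it can help to confirm that the programs called in the steps of this protocol are installed. The following is a minimal sketch, assuming each tool is invoked under the same name used in the commands here (for example, that Pilon is available as a `pilon` wrapper rather than as a bare jar file). The helper name `missing_tools` is hypothetical.

```shell
# Hypothetical helper: print every named program that is not found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Programs invoked by the commands in this protocol.
missing=$(missing_tools fastp spades.py bowtie2-build bowtie2 samtools pilon)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing >&2
fi
```

If nothing is printed, all six programs were found.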
Perform quality-based filtering and adapter trimming using fastp.
mkdir name_fastp_out
fastp -i name_r1.fq.gz -I name_r2.fq.gz -o name_trimmed_r1.fq.gz -O name_trimmed_r2.fq.gz --unpaired1 name_trimmed_unp.fq.gz --unpaired2 name_trimmed_unp.fq.gz -r --cut_right_window_size 5 --cut_right_mean_quality 20 -c -l 50 -j name_fastp_out/name_fastp_report.json -h name_fastp_out/name_fastp_report.html
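After trimming, the two paired output files should contain the same number of reads, because reads that lose their mate are diverted to the unpaired file. The sketch below is one way to check this. It assumes standard four-line FASTQ records, and the helper names `count_reads` and `check_pairs` are hypothetical.

```shell
# Hypothetical helper: count reads in a gzipped FASTQ file.
# Assumes four lines per record (no wrapped sequence lines).
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Compare the read counts of the two paired files; fail if they differ.
check_pairs() {
    local n1 n2
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 read pairs"
    else
        echo "MISMATCH: $n1 vs $n2 reads" >&2
        return 1
    fi
}

# Usage, with the file names from the fastp command:
# check_pairs name_trimmed_r1.fq.gz name_trimmed_r2.fq.gz
```

The HTML and JSON reports written by fastp give a fuller picture of read quality before and after filtering.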
Perform de-novo assembly using SPAdes.
spades.py -1 name_trimmed_r1.fq.gz -2 name_trimmed_r2.fq.gz -s name_trimmed_unp.fq.gz --careful --cov-cutoff auto -o name_spades_out
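The polishing commands refer to the assembly as name.fasta, but the protocol does not state which SPAdes output file this is. SPAdes writes both contigs.fasta and scaffolds.fasta into its output directory. The sketch below assumes that contigs.fasta is the file carried forward. That is an assumption for illustration, not the authors' stated choice, and the helper name `stage_assembly` is hypothetical.

```shell
# Hypothetical helper: copy the SPAdes contigs to the file name used in the
# polishing step, then print the number of sequences copied.
# ASSUMPTION: contigs.fasta (rather than scaffolds.fasta) is the assembly to polish.
stage_assembly() {
    local spades_dir=$1 out_fasta=$2
    local src="$spades_dir/contigs.fasta"
    if [ ! -s "$src" ]; then
        echo "No assembly found at $src" >&2
        return 1
    fi
    cp "$src" "$out_fasta"
    # Each FASTA record starts with a '>' header line.
    grep -c '^>' "$out_fasta"
}

# Usage, with the directory name from the SPAdes command:
# stage_assembly name_spades_out name.fasta
```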
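Polish the preliminary assembly using Pilon, based on alignment of the trimmed reads against it with Bowtie 2. In the commands below, name.fasta is the preliminary assembly from SPAdes.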
bowtie2-build name.fasta name
bowtie2 -x name -1 name_trimmed_r1.fq.gz -2 name_trimmed_r2.fq.gz -S name_vs_name.sam
samtools view -b -T name.fasta name_vs_name.sam -o name_vs_name.sam.bam
samtools sort --reference name.fasta name_vs_name.sam.bam -o name_vs_name.sam.bam.sorted.bam
samtools index name_vs_name.sam.bam.sorted.bam
rm name_vs_name.sam.bam name_vs_name.sam
pilon --genome name.fasta --frags name_vs_name.sam.bam.sorted.bam --output name.pilon --outdir name_pilon_out
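With the options above, Pilon writes the polished sequence to name_pilon_out/name.pilon.fasta. A quick way to see the effect of polishing is to compare the number of sequences and the total length before and after. The following is a minimal sketch using awk. The helper name `fasta_stats` is hypothetical, and the check is a convenience rather than part of the original protocol.

```shell
# Hypothetical helper: print "<number of sequences><TAB><total bases>" for a FASTA file.
fasta_stats() {
    awk '
        /^>/ { n++; next }                          # header line: count one sequence
        { gsub(/[ \t\r]/, ""); bp += length($0) }   # sequence line: add its length
        END { printf "%d\t%d\n", n, bp }
    ' "$1"
}

# Usage, with the file names from the commands above:
# fasta_stats name.fasta
# fasta_stats name_pilon_out/name.pilon.fasta
```

Polishing typically changes the total length only slightly, since Pilon mainly corrects single-base errors and small indels, so a large difference between the two outputs is worth investigating.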