Phylogenomic analysis of Xanthomonas

David J Studholme

Published: 2022-08-06 DOI: 10.17504/protocols.io.261geny57g47/v1


Abstract

This is a protocol for using PhaME to generate a phylogenomic tree from a set of Xanthomonas spp. genome sequences.

Steps

1.

Create a directory for downloaded genome sequence data:

#Create directory (Ubuntu 22.04)
mkdir genomes
2.

Enter the directory for downloaded genome sequence data:

#Enter directory (Ubuntu 22.04)
cd genomes
3.

Ensure that the NCBI Datasets command-line tools are installed and that the datasets executable (or a symbolic link to it) is in the current directory.

Software

Name: NCBI Datasets command line tools
Developer: NCBI
Link: https://www.ncbi.nlm.nih.gov/datasets/docs/v1/download-and-install/
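
A quick way to confirm this before downloading is sketched below. The function name check_datasets_tool is illustrative and not part of the NCBI tools; it only tests that a file called datasets in the given directory is executable, which is what the ./datasets invocation in the next step relies on.

```shell
# Hypothetical helper: succeed only if DIR contains an executable file named
# 'datasets' (either the binary itself or a symbolic link that resolves to it).
check_datasets_tool() {
    dir="${1:-.}"
    if [ -x "$dir/datasets" ] && [ ! -d "$dir/datasets" ]; then
        echo "OK: $dir/datasets is executable"
    else
        echo "Missing or non-executable: $dir/datasets" >&2
        return 1
    fi
}
```

From inside the genomes directory this would be called as: check_datasets_tool .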
4.

Download the genome assemblies that will be included in the analysis:

#Download assemblies from NCBI (Ubuntu 22.04 LTS)
./datasets download genome accession --inputfile xanthomonas_assm_accs.txt --exclude-gff3 --exclude-protein --exclude-rna --exclude-genomic-cds --filename xanthomonas_genome_assemblies.zip

Expected result
You should see a message similar to: "Downloading: xanthomonas_genome_assemblies.zip 197MB done"
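
The download depends on the contents of xanthomonas_assm_accs.txt, so it can help to check that file first. The sketch below assumes the file holds one assembly accession per line, in the usual GCA_ or GCF_ form (nine digits followed by a version number). The function name check_accessions is hypothetical. Note that the symbolic-link command later in this step matches only GCA_ names, so GCF_ accessions would pass this check but would not be linked.

```shell
# Hypothetical helper: report lines that do not look like assembly accessions.
# Assumes one accession per line; blank lines are ignored.
check_accessions() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "Cannot read $file" >&2
        return 1
    fi
    # Select lines that are neither blank nor a well-formed accession
    bad=$(grep -n -v -E '^(GC[AF]_[0-9]{9}\.[0-9]+)?[[:space:]]*$' "$file" || true)
    if [ -n "$bad" ]; then
        echo "Unexpected lines in $file:" >&2
        echo "$bad" >&2
        return 1
    fi
    echo "All accessions in $file look well formed"
}
```

For example: check_accessions xanthomonas_assm_accs.txt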

#Unzip the assemblies download (Ubuntu 22.04 LTS)
unzip xanthomonas_genome_assemblies.zip
#Make symbolic links to the downloaded assemblies (Ubuntu 22.04 LTS)
ln -s ncbi_dataset/data/GCA_*/GCA_*.fna .
#List the symbolic links to assembly sequence files (Ubuntu 22.04 LTS)
ls *.fna
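
As a sanity check on the listing above, the number of .fna links can be compared with the number of accessions requested. This is a minimal sketch: compare_counts is a hypothetical name, and it assumes that each accession yields exactly one .fna file, which may not hold for every download.

```shell
# Hypothetical helper: compare the number of accessions in ACC_FILE with the
# number of .fna entries (files or symbolic links) directly inside DIR.
compare_counts() {
    acc_file="$1"
    dir="${2:-.}"
    if [ ! -r "$acc_file" ]; then
        echo "Cannot read $acc_file" >&2
        return 1
    fi
    n_acc=$(grep -c -E '^GC[AF]_' "$acc_file" || true)
    n_fna=$(find "$dir" -maxdepth 1 -name '*.fna' | wc -l | tr -d ' ')
    echo "Accessions requested: $n_acc; .fna links found: $n_fna"
    [ "$n_acc" -eq "$n_fna" ]
}
```

For example: compare_counts xanthomonas_assm_accs.txt .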
5.

Rename the symbolic links to more informative names using the rename_files.pl script.

Software

Name: rename_files.pl
Repository: https://github.com/davidjstudholme/phylogenomics-Xanthomonas/blob/main/rename_files.pl

perl rename_files.pl genomes.txt

Expected result
This should generate a set of .fasta files and .contig symbolic links with informative filenames.
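
Because later steps link to these .contig files, it is worth confirming that none of the links is dangling. The following sketch uses a hypothetical function name, find_broken_contigs, and relies only on the standard shell tests -L (is a symbolic link) and -e (target exists).

```shell
# Hypothetical helper: report .contig symbolic links in DIR whose target is
# missing. Returns non-zero if at least one broken link is found.
find_broken_contigs() {
    dir="${1:-.}"
    status=0
    for f in "$dir"/*.contig; do
        # If nothing matches, the pattern stays literal and -L is false
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "Broken link: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, from inside the genomes directory: find_broken_contigs .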

6.

Leave the directory for downloaded genome sequence data and return to the previous directory:

#Change to previous directory (Ubuntu 22.04 LTS)
cd -
7.

Set up the reference genome sequence data:

#Create reference directory (Ubuntu 22.04 LTS)
mkdir ref
#Enter the reference directory (Ubuntu 22.04 LTS)
cd ref
#Make symbolic link to reference genome assembly (Ubuntu 22.04 LTS)
ln -s ../genomes/X._campestris_pv._campestris_ATCC_33913_T.fasta .
#Change back to the root directory (Ubuntu 22.04 LTS)
cd -
8.

Set up the working directory:

#Create working directory (Ubuntu 22.04 LTS)
mkdir workdir
cd workdir
#Make symbolic links to all the genome assemblies (Ubuntu 22.04 LTS)
ln -s ../genomes/*.contig .

Optionally, at this point, we can delete the symbolic links for any genomes that we want to exclude from the final analysis. It is also possible to add further genome assemblies as .contig files (in FASTA format).

#Return to previous directory (Ubuntu 22.04 LTS)
cd -
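
The optional exclusion described in this step can be scripted when several genomes need to be dropped. The sketch below is only an illustration: the function name exclude_genomes and the list file (one .contig file name per line) are hypothetical and not part of the original protocol. It removes symbolic links only, so assemblies copied in by hand as regular files are left in place.

```shell
# Hypothetical helper: remove the .contig links named in LIST (one file name
# per line) from WORKDIR. Regular files are never deleted.
exclude_genomes() {
    workdir="$1"
    list="$2"
    while IFS= read -r name || [ -n "$name" ]; do
        # Skip blank lines
        [ -z "$name" ] && continue
        if [ -L "$workdir/$name" ]; then
            rm -- "$workdir/$name"
            echo "Removed $workdir/$name"
        else
            echo "Skipped (not a symbolic link): $workdir/$name" >&2
        fi
    done < "$list"
}
```

For example, run from the directory that contains workdir, with exclude.txt as an assumed list file: exclude_genomes workdir exclude.txt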




9.

Install PhaME software into a Conda environment called 'phame', following instructions on the software's GitHub page:

Software

Name: PhaME
Repository: https://github.com/LANL-Bioinformatics/PhaME

The PhaME software is described in this paper:

Citation
Shakya M, Ahmed SA, Davenport KW, Flynn MC, Lo CC, Chain PSG 2020 Standardized phylogenetic and molecular evolutionary analysis applied to species across the microbial tree of life. Scientific reports https://doi.org/10.1038/s41598-020-58356-1

10.

Activate the PhaME Conda environment:

#Activate PhaME Conda environment (Ubuntu 22.04 LTS)
conda activate phame
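
Before PhaME is executed, a brief pre-flight check can catch a missing input. The sketch below assumes the layout built in the earlier steps: a control file named phame.ctl next to the ref and workdir directories, at least one .fasta file in ref, and at least one .contig file in workdir. The function name preflight is hypothetical, and the check does not inspect the contents of the control file.

```shell
# Hypothetical pre-flight check. ROOT is the directory that holds phame.ctl,
# ref/ and workdir/ (defaults to the current directory).
preflight() {
    root="${1:-.}"
    ok=0
    [ -f "$root/phame.ctl" ] || { echo "Missing control file: $root/phame.ctl" >&2; ok=1; }
    ls "$root"/ref/*.fasta >/dev/null 2>&1 || { echo "No .fasta file in $root/ref" >&2; ok=1; }
    ls "$root"/workdir/*.contig >/dev/null 2>&1 || { echo "No .contig file in $root/workdir" >&2; ok=1; }
    return "$ok"
}
```

For example: preflight . (a non-zero exit status means at least one input is missing).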
12.

Execute PhaME:

#Execute PhaME (Ubuntu 22.04 LTS)
phame ./phame.ctl

Expected result
This will generate output, including tree files, in the directory ./workdir/results/trees/

13.

The tree file can now be visualised using any tree-viewing software, for example, iTOL.

Citation
Letunic I, Bork P 2021 Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic acids research https://doi.org/10.1093/nar/gkab301
