Phylogenomic analysis of Xanthomonas
David J Studholme
Abstract
This is a protocol for using PhaME to generate a phylogenomic tree from a set of Xanthomonas spp. genome sequences.
Steps
Create a directory for downloaded genome sequence data:
#Create directory (Ubuntu 22.04)
mkdir genomes
Enter the directory for downloaded genome sequence data:
#Enter directory (Ubuntu 22.04)
cd genomes
Ensure that the NCBI Datasets command line tools are installed and that the executable (or a symbolic link to it) is in the current directory.
Software
Value | Label |
---|---|
NCBI Datasets command line tools | NAME |
NCBI | DEVELOPER |
https://www.ncbi.nlm.nih.gov/datasets/docs/v1/download-and-install/ | LINK |
Download the genome assemblies that will be included in the analysis:
#Download assemblies from NCBI (Ubuntu 22.04 LTS)
./datasets download genome accession --inputfile xanthomonas_assm_accs.txt --exclude-gff3 --exclude-protein --exclude-rna --exclude-genomic-cds --filename xanthomonas_genome_assemblies.zip
#Unzip the assemblies download (Ubuntu 22.04 LTS)
unzip xanthomonas_genome_assemblies.zip
#Make symbolic links to the downloaded assemblies (Ubuntu 22.04 LTS)
ln -s ncbi_dataset/data/GCA_*/GCA_*.fna .
#List the symbolic links to assembly sequence files (Ubuntu 22.04 LTS)
ls *.fna
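The download step reads assembly accessions from xanthomonas_assm_accs.txt, one per line. A stray blank line or Windows line ending in that file can cause the download to fail or to skip a genome, so it may be worth checking the list first. The sketch below is a hypothetical helper, not part of the original protocol; it assumes that accessions take the usual GCA_/GCF_ form followed by nine digits and a version number.

```shell
# Hypothetical helper (not part of the original protocol): report lines in an
# accession list that do not look like NCBI assembly accessions.
# Assumed pattern: GCA_ or GCF_, nine digits, a dot, then a version number.
check_accessions() {
    # $1: path to the accession list, one accession per line
    # grep -v selects the lines that do NOT match; -n prefixes line numbers.
    if grep -Evn '^GC[AF]_[0-9]{9}\.[0-9]+$' "$1"; then
        echo "Malformed lines found in $1" >&2
        return 1
    fi
    echo "All accessions in $1 look well-formed"
}
```

For example, `check_accessions xanthomonas_assm_accs.txt` prints any offending lines with their line numbers and returns a non-zero status if it finds one.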
Rename the symbolic links to more informative names using the rename_files.pl script:
Software
Value | Label |
---|---|
rename_files.pl | NAME |
https://github.com/davidjstudholme/phylogenomics-Xanthomonas/blob/main/rename_files.pl | REPOSITORY |
perl rename_files.pl genomes.txt
Come back out of the directory for downloaded genome sequence data:
#Change to previous directory (Ubuntu 22.04 LTS)
cd -
Set up the reference genome sequence data:
#Create reference directory (Ubuntu 22.04 LTS)
mkdir ref
#Enter the reference directory (Ubuntu 22.04 LTS)
cd ref
#Make symbolic link to reference genome assembly (Ubuntu 22.04 LTS)
ln -s ../genomes/X._campestris_pv._campestris_ATCC_33913_T.fasta .
#Change back to the root directory (Ubuntu 22.04 LTS)
cd -
Set up the working directory:
#Create working directory (Ubuntu 22.04 LTS)
mkdir workdir
cd workdir
#Make symbolic links to all the genome assemblies (Ubuntu 22.04 LTS)
ln -s ../genomes/*.contig .
Optionally, at this point, delete the symbolic links for any genomes that should be excluded from the final analysis. Additional genome assemblies can also be added here as .contig files (in FASTA format).
#Return to previous directory (Ubuntu 22.04 LTS)
cd -
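Because the working directory is populated entirely with symbolic links, a link whose target has been renamed or removed is silently broken. In addition, if the `*.contig` pattern matches nothing, bash passes the pattern through unchanged and `ln` creates a dangling link literally named `*.contig`. The following hypothetical helper, which is not part of the original protocol, lists any such dangling links so that they can be fixed before running the analysis.

```shell
# Hypothetical helper (not part of the original protocol): list symbolic links
# in a directory whose targets do not exist.
list_broken_links() {
    # $1: directory to inspect (defaults to the current directory)
    # 'test -e' follows the link, so it fails when the target is missing.
    find "${1:-.}" -maxdepth 1 -type l ! -exec test -e {} \; -print
}
```

Running `list_broken_links workdir` from the top-level directory should print nothing if every link resolves.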
Install PhaME software into a Conda environment called 'phame', following instructions on the software's GitHub page:
Software
Value | Label |
---|---|
PhaME | NAME |
https://github.com/LANL-Bioinformatics/PhaME | REPOSITORY |
The PhaME software is described in this paper:
#Activate PhaME Conda environment (Ubuntu 22.04 LTS)
conda activate phame
Create the PhaME control file, phame.ctl, in the current directory.
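The protocol does not list the contents of phame.ctl. As a rough starting point only, the sketch below writes a control file that points PhaME at the ref and workdir directories created in the earlier steps. The parameter names and the meanings given in the comments are recalled from the PhaME documentation and are assumptions here, as are the project name, thread count, and all option values; they should be checked against the example control file in the PhaME repository before use.

```shell
# Write an ILLUSTRATIVE phame.ctl. Every key and value below is an assumption
# to be verified against the example control file in the PhaME repository.
cat > phame.ctl <<'EOF'
refdir = ref          # directory holding the reference genome (created above)
workdir = workdir     # directory holding the .contig links; assumed output location
reference = 1         # assumed: 1 = use the reference named in reffile
reffile = X._campestris_pv._campestris_ATCC_33913_T.fasta
project = xanthomonas # hypothetical project name
cdsSNPS = 0           # assumed: 0 = do not split SNPs into coding/non-coding
buildSNPdb = 0
FirstTime = 1
data = 1              # assumed: 1 = contig inputs only; check the documented codes
reads = 2
tree = 1              # assumed: 1 = build the tree with FastTree
bootstrap = 0
N = 100
PosSelect = 0
code = 0              # assumed: 0 = Bacteria
clean = 0
threads = 4           # hypothetical thread count
cutoff = 0.1
EOF
```

The relative paths assume that PhaME is launched from the directory containing ref and workdir, as in the execution step of this protocol.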
Execute PhaME:
#Execute PhaME (Ubuntu 22.04 LTS)
phame ./phame.ctl
The tree file can now be visualised using any tree-viewing software, for example, iTOL.
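Before loading the tree into a viewer, it can be useful to confirm that it contains the expected number of genomes. The hypothetical helper below, which is not part of the original protocol, counts the tips in a Newick file by counting commas: a Newick tree with L tips contains L - 1 commas, provided that no taxon label itself contains a comma. The location and name of the tree file depend on the PhaME configuration, so the path is passed as an argument rather than assumed.

```shell
# Hypothetical helper (not part of the original protocol): count the tips in a
# Newick tree file. Assumes that no taxon label contains a comma.
count_newick_tips() {
    # $1: path to a Newick-format tree file
    # tr -cd ',' deletes every character except commas; wc -c counts them.
    commas=$(tr -cd ',' < "$1" | wc -c)
    echo $((commas + 1))
}
```

The result can be compared with the number of .contig links in workdir, plus the reference genome if it is not already among them.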