SubCellBarCode: integrated workflow for robust spatial proteomics by mass spectrometry

Taner Arslan, Yanbo Pan, Georgios Mermelekas, Mattias Vesterlund, Lukas M. Orre, Janne Lehtiö

Published: 2022-06-21 DOI: 10.1038/s41596-022-00699-2

Extended data

Extended Data Fig. 1 Samples generated during cell fractionation.

a, Duplicate dishes of cells were incubated with digitonin (42 μg/ml) on a tilt rocker at 4 °C for 7 min. b, After incubation, the digitonin solution was recovered as fraction FS1. c, Excessive digitonin treatment (>10 min) disrupts chromatin, resulting in DNA ‘threads’ in the dish when the cells are harvested. d, Magnified view of the DNA ‘threads’. e, After harvesting by scraping into low-salt buffer, the cells are transferred to a Dounce homogenizer. The cells are mechanically disrupted by repeated strokes of a tight-fitting glass pestle in the homogenizer. One stroke equals one down-and-up movement of the pestle. The strokes should be performed below the liquid level to avoid foaming. f, Examples of FP1 pellets generated by low-speed centrifugation. g, Examples of FP2 pellets generated by medium-speed centrifugation. For some cell lines, FP2 may be difficult to see with the naked eye. h, Examples of soluble fractions (FS2) and pellets (FP3) generated by ultracentrifugation at 100,000 g for 1 h. sup., supernatant.

Extended Data Fig. 2 MS data comparison of MS approaches.

a, Cumulative distribution plots showing the number of PSMs used for protein quantification for the different MS approaches. The percentage of quantifications based on at least three PSMs is indicated in each plot. b, Scatter plots displaying the number of PSMs used for quantification against the fractionation profile correlation between replicates. c, Bar plots showing the number of proteins identified (left, 1% FDR, gene centric), the number of unique peptides (middle, 1% FDR) and the number of PSMs used for quantification (right) for the three MS approaches used. corr., correlation.

Source data
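The percentage annotated in panel a of Extended Data Fig. 2 can be illustrated with a minimal sketch. It assumes one PSM count per protein quantification; the function name `percent_with_min_psms` and the example counts are hypothetical and are not taken from the study data or from the SubCellBarCode package.

```python
def percent_with_min_psms(psm_counts, threshold=3):
    """Return the percentage of quantifications backed by at least `threshold` PSMs."""
    if not psm_counts:
        raise ValueError("psm_counts must not be empty")
    supported = sum(1 for n in psm_counts if n >= threshold)
    return 100.0 * supported / len(psm_counts)


# Hypothetical PSM counts, for illustration only (not study data).
example_counts = [1, 2, 3, 5, 8, 1, 4, 12]
print(f"{percent_with_min_psms(example_counts):.1f}% of quantifications use >= 3 PSMs")
```

Applying the same function to the minimum PSM count per protein (column M of Supplementary Table 2) would give a per-dataset summary comparable across MS approaches.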

Extended Data Fig. 3 Classification output and MS method comparison.

a, Bar plots indicating the number of compartment classifications for the three MS approaches used. b, Box plots displaying the classification probabilities for compartment-level classifications for the three MS methods used. c, Box plots showing the minimum number of PSMs used for quantification of proteins, grouped by neighborhood classification agreement between HiRIEF and high-pH strategies (left) or HiRIEF and long-gradient strategies (right). P values were calculated with a two-sided Wilcoxon signed-rank test. d, Bar plots indicating the neighborhood classification agreement between methods for proteins quantified with one to two PSMs or with more than two PSMs. e, Bar plots showing the number of classifications and the classification frequency of proteins binned by quantitative range (maximum value through minimum value in the fractionation profile) for the different MS approaches. Quantitative data are grouped into five bins: 0–0.5, 0.5–1, 1–1.5, 1.5–2 and >2. The elements of the box plots are as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers.

Source data
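The binning by quantitative range described for panel e of Extended Data Fig. 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the quantitative range is taken to be the maximum minus the minimum of a protein's fractionation profile, bins are treated as closed on the right (so a range of exactly 2 falls in 1.5-2), and the names `quantitative_range` and `range_bin` are hypothetical.

```python
from bisect import bisect_left

# Upper edges of the first four bins; anything above the last edge is ">2".
BIN_EDGES = [0.5, 1.0, 1.5, 2.0]
BIN_LABELS = ["0-0.5", "0.5-1", "1-1.5", "1.5-2", ">2"]


def quantitative_range(profile):
    """Assumed definition: maximum minus minimum of the fractionation profile."""
    return max(profile) - min(profile)


def range_bin(profile):
    """Assign a profile to one of the five bins (right-closed intervals, an assumption)."""
    return BIN_LABELS[bisect_left(BIN_EDGES, quantitative_range(profile))]


# Hypothetical five-fraction profile, for illustration only.
profile = [0.2, 1.9, 0.4, 0.1, 0.3]
print(range_bin(profile))
```

Proteins with a narrow range have flat profiles across the fractions, which is why the caption reports classification frequency per bin.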

Extended Data Fig. 4 Comparison of classification output for different MS approaches.

a, Density plots showing the minimum number of PSMs used for quantification of proteins. Density plots are colored on the basis of agreement between the HiRIEF and high-pH MS strategies: yellow if the protein classification is the same in both methods, red if it is different, blue if the protein is unclassified in one of the methods and green if it is unclassified in both methods. b, Scatter plot showing the maximum neighborhood classification probability score of the HiRIEF and high-pH methods against the minimum number of PSMs used for quantification in the HiRIEF and high-pH methods. Proteins are colored on the basis of classification agreement as described in a. c, SubCellBarCodes for APC protein from the original five cell lines analyzed and available in the SubCellBarCode.org resource. The left panel shows neighborhood-level classifications, and the right panel shows compartment-level classifications. d, Scatter plots showing the classification probability of proteins for the HiRIEF and long-gradient MS strategies (left) and the high-pH and long-gradient MS strategies (right). Proteins are colored on the basis of classification agreement as described in a. class., classified; grad., gradient; Unclass., unclassified.

Source data
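The four agreement categories used to color proteins in Extended Data Fig. 4 can be expressed as a small function. This is a minimal sketch, assuming an unclassified protein is represented as `None`; the function name, the dictionary and the example labels are hypothetical and do not come from the SubCellBarCode package.

```python
# Color scheme as stated in the caption of Extended Data Fig. 4.
AGREEMENT_COLORS = {
    "same": "yellow",
    "different": "red",
    "unclassified in one": "blue",
    "unclassified in both": "green",
}


def agreement_category(class_a, class_b):
    """Compare two classifications of one protein; None means unclassified (an assumption)."""
    if class_a is None and class_b is None:
        return "unclassified in both"
    if class_a is None or class_b is None:
        return "unclassified in one"
    return "same" if class_a == class_b else "different"


# Illustrative labels only: (classification in method 1, classification in method 2).
pairs = [("Cytosol", "Cytosol"), ("Cytosol", "Nuclear"), (None, "Nuclear"), (None, None)]
for a, b in pairs:
    category = agreement_category(a, b)
    print(a, b, category, AGREEMENT_COLORS[category])
```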

Supplementary information

Supplementary Information

Supplementary Method 1. Bioconductor vignette for the SubCellBarCode R package

Reporting Summary

Supplementary Table 1

LC gradient lengths and strategy used for LC-MS analysis of the individual fractions (column A) generated by HiRIEF pre-fractionation in the pH ranges 3–10 (column B) and 3.4–4.8 (column C).

Supplementary Table 2

Relative quantitative data (TMT ratios) for the different MS datasets used in the current study. Data are presented in five sheets: combined HiRIEF 3–10/3.4–4.8, HiRIEF 3–10, HiRIEF 3.4–4.8, high pH and long gradient. Each sheet includes the following columns: column A, gene symbol–centric protein ID; columns B–L, TMT ratios for the five fractions (FS1, FS2 and FP1–3) in duplicate (A and B); column M, minimum number of PSMs used for quantification in any of the 10 TMT channels.

Supplementary Table 3

SubCellBarCode classification output for the different MS datasets used in the current study. Data are presented in five sheets: combined HiRIEF 3–10/3.4–4.8, HiRIEF 3–10, HiRIEF 3.4–4.8, high pH and long gradient. Each sheet includes the following columns: column A, gene symbol–centric protein ID; column B, final neighborhood classification (SVMoutput); column C, final compartment classification (SVMoutput); columns D–G, SVM-derived probabilities for the individual neighborhoods; columns H–V, SVM-derived probabilities for the individual compartments.


