Identification of PKC-regulated phosphosites on LRRK1 by mass spectrometry analysis
Asad Malik, Raja Sekhar Nirujogi, Toan K. Phung, Dario R. Alessi
Abstract
We describe a non-radioactive, mass spectrometry-based assay that we use to identify novel PKC-regulated phosphosites on LRRK1 that are responsible for activating its kinase activity.
Steps
Preparation of lipid vesicles for PKC activation
Clean a disposable glass culture tube by washing three times with 100% methanol. Allow to air-dry.
Pipette 0.5 µL of Diacylglycerol (stock concentration 10 mg/mL) and 5 µL of Phosphatidylserine (stock concentration 10 mg/mL) into the cleaned and dried glass tube.
Vacuum dry the lipids using a SpeedVac system for 10 min. This should leave a visible, translucent lipid pellet.
Resuspend the lipids from step 3 in 50 µL of 25 mM HEPES pH 7.4, 50 mM KCl. Vortex gently until the pellet is no longer visible.
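The lipid amounts in the steps above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is ours, and it assumes that the dried lipids are fully recovered in the 50 µL resuspension volume.

```python
def resuspended_conc_ug_per_ml(stock_mg_per_ml, volume_ul, resuspension_ul):
    """Concentration (µg/mL) after drying `volume_ul` of stock and resuspending it."""
    mass_ug = stock_mg_per_ml * volume_ul        # mg/mL is the same as µg/µL
    return mass_ug / resuspension_ul * 1000      # µg/µL -> µg/mL


# Values taken from the steps above (assumes complete recovery of the lipids).
ps_ug_per_ml = resuspended_conc_ug_per_ml(10, 5, 50)      # Phosphatidylserine
dag_ug_per_ml = resuspended_conc_ug_per_ml(10, 0.5, 50)   # Diacylglycerol

# Fold dilution needed to reach the 200 µg/mL / 20 µg/mL of the 2X master mix.
ps_dilution = ps_ug_per_ml / 200
dag_dilution = dag_ug_per_ml / 20
print(ps_ug_per_ml, dag_ug_per_ml, ps_dilution, dag_dilution)
```

Under these assumptions the vesicle suspension would be diluted by the same factor for both lipids when preparing the 2X master mix.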
Kinase Reaction: Phosphorylation of LRRK1 by PKC
Prepare a primary “2X master mix” containing 50 mM HEPES pH 7.5, 100 mM KCl, 0.2% (v/v) 2-Mercaptoethanol, 20 mM MgCl2, 2 mM ATP, 2 mM CaCl2, 200 µg/mL Phosphatidylserine and 20 µg/mL Diacylglycerol.
For each reaction, add 15 µL of the primary “2X master mix” to a clean Eppendorf tube.
Add 7.5 µL of 200 nM LRRK1 wild type protein (final concentration 50 nM) to each reaction and allow to equilibrate on ice for 5 min.
Start the kinase reaction by adding 7.5 µL of 400 nM PKC Alpha protein (final concentration 100 nM).
Transfer the Eppendorf tubes to a thermomixer set at 30 °C and 1000 rpm. Incubate for 45 min.
Stop the kinase reaction by adding 10 µL of 4X LDS loading buffer to the reaction mix (final concentration 1X).
Incubate the samples for 5 min at 70 °C on a heat block before proceeding to the SDS-polyacrylamide gel electrophoresis (SDS-PAGE) section.
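The final concentrations quoted in the kinase reaction steps follow from simple dilution arithmetic. The sketch below reproduces them; the helper name `final_conc` is ours and is not part of the protocol.

```python
def final_conc(stock_conc, added_ul, total_ul):
    """Concentration after diluting `added_ul` of stock into `total_ul` total volume."""
    return stock_conc * added_ul / total_ul


# Volumes (µL) from the steps above: 2X master mix + LRRK1 + PKC Alpha.
reaction_ul = 15 + 7.5 + 7.5

lrrk1_nm = final_conc(200, 7.5, reaction_ul)       # LRRK1, nM
pkc_nm = final_conc(400, 7.5, reaction_ul)         # PKC Alpha, nM
master_mix_x = final_conc(2, 15, reaction_ul)      # 2X master mix -> fold concentration
lds_x = final_conc(4, 10, reaction_ul + 10)        # 4X LDS after adding 10 µL

print(reaction_ul, lrrk1_nm, pkc_nm, master_mix_x, lds_x)
```

The same helper can be reused if the reaction is scaled up, as long as the volume ratios are kept.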
SDS-polyacrylamide gel electrophoresis (SDS-PAGE):
Load samples onto a NuPAGE 4–12% Bis–Tris Midi Gel (ThermoFisherScientific, Cat#WG1402BOX or Cat#WG1403BOX), alongside pre-stained molecular weight markers (ranging from 10 kDa to 250 kDa). Rinse wells carefully with running buffer before loading samples.
Electrophorese the samples at 130 V with MOPS SDS running buffer for 2 h or until the blue dye runs off the gel.
Place the gel in a clean 15 cm glass dish and cover with 15-20 mL of InstantBlue® Coomassie Protein stain. Incubate on a see-saw rocker for 1 h at room temperature.
Replace the InstantBlue® Protein stain with double-distilled water and allow to de-stain at room temperature for 2 min before proceeding with peptide digestion as described in the Total Protein Digestion section.
Total Protein Digestion
Using a clean scalpel, excise the stained bands corresponding to LRRK1 from the gel and cut them into approximately 1 mm² gel pieces.
Transfer the gel pieces into a low-bind tube.
De-stain the gel pieces by repeated 10 min washes in 40% (v/v) ACN in 40 mM NH4HCO3.
Reduce peptides by addition of 100 µL of 5 mM DTT in 40 mM NH4HCO3. Incubate on a thermomixer at 56 °C and 1200 rpm for 30 min.
Remove the DTT solution and incubate the gel pieces in 40% (v/v) ACN in 40 mM NH4HCO3 for 10 min at room temperature.
Alkylate peptides by addition of 20 mM iodoacetamide in 40 mM NH4HCO3 and incubate at room temperature for 30 min at 1200 rpm.
Dehydrate the gel pieces by washing in 100% (v/v) ACN for 10 min.
Remove the supernatant using a pipette and vacuum dry the gel pieces to remove any residual ACN.
Add 100 ng of protease in 100 µL of the appropriate buffer (see Table 1) to the gel pieces from step 23 and incubate for 10 min on a thermomixer at 37 °C and 1200 rpm.
Protease | Buffer |
---|---|
Trypsin + LysC | 50 mM TEABC |
Asp-N | 50 mM Tris-HCl |
Chymotrypsin | 100 mM Tris-HCl + 10 mM CaCl2 |
Table 1: Protease combinations used for total protein digestion and appropriate buffers for each protease.
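Using several proteases produces different, overlapping peptides from the same protein, which helps to cover sites that fall into peptides that are too short or too long with one enzyme. The sketch below illustrates this with an in silico digest. The cleavage rules are deliberately simplified assumptions (real enzymes have additional, weaker specificities), and the test sequence is a made-up example, not LRRK1.

```python
import re

# Simplified cleavage rules (assumptions for illustration only).
CLEAVAGE_RULES = {
    "Trypsin": re.compile(r"(?<=[KR])(?!P)"),        # C-terminal to K/R, not before P
    "LysC": re.compile(r"(?<=K)"),                   # C-terminal to K
    "Asp-N": re.compile(r"(?=D)"),                   # N-terminal to D
    "Chymotrypsin": re.compile(r"(?<=[FWY])(?!P)"),  # C-terminal to F/W/Y, not before P
}


def digest(sequence, protease):
    """Return the fully cleaved peptides of `sequence` for the given protease."""
    cuts = [m.start() for m in CLEAVAGE_RULES[protease].finditer(sequence)]
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


example = "MKTAYDRPSFDWKLR"  # hypothetical sequence
for name in CLEAVAGE_RULES:
    print(name, digest(example, name))
```

Missed cleavages, which the search settings allow, are not modelled here.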
Peptide extraction
Supplement the samples from step 24 with 50 µL of extraction buffer (80% ACN in 0.2% Formic Acid) and incubate on a thermomixer at room temperature for 10 min at 1200 rpm.
Centrifuge the samples for 1 min at 2000 x g to pellet the gel pieces and, using a pipette, carefully transfer the supernatant to a new low-binding tube.
Repeat step 25 until the gel pieces appear completely dried. Each time, transfer the supernatant into the same tube (from step 26).
Vacuum dry the combined supernatants (containing the digested peptides) and proceed with the C18 clean-up protocol (as described in the C18 stage-tip protocol section).
C18 stage-tip protocol:
Prepare a single-layer C18 stage-tip using a 16-gauge syringe.
Resuspend the vacuum-dried peptides from step 28 in 80 µL of Solvent A1 (0.1% (by vol) TFA in MQ-H2O).
Add 80 µL of 100% (by vol) ACN to the C18 stage-tip from step 29 and centrifuge at 2000 x g for 2 min at room temperature. Discard the flow-through.
Add 80 µL of Solvent A1 (0.1% (by vol) TFA in MQ-H2O) and centrifuge at 2000 x g for 2 min at room temperature. Discard the flow-through. Repeat this step.
Load the acidified peptide digest from step 30 onto the C18 stage-tip from step 32 and centrifuge at 1500 x g for 5 min at room temperature.
Reapply the flow-through to the C18 stage-tip column and centrifuge at 1500 x g for 5 min at room temperature.
Add 80 µL of Solvent A1 (0.1% (by vol) TFA in MQ-H2O) to the C18 stage-tip column and centrifuge at 2000 x g for 2 min at room temperature. Discard the flow-through. Repeat once.
Place the C18 stage-tip from step 35 into a new 1.5 mL low-binding tube.
Elute the peptides from the C18 stage-tip by adding 40 µL of elution buffer (Solvent B1: 40% (by vol) acetonitrile in 0.1% (by vol) TFA in MQ-H2O) and centrifuge at 1500 x g for 2 min.
Repeat step 37.
Immediately snap freeze the eluted peptides from step 38 on dry ice and vacuum dry.
Perform mass spectrometry analysis of the peptides as described in LC-MS/MS analysis section.
LC-MS/MS analysis
Dissolve the peptides in LC-Buffer (3% ACN (v/v) in 0.1% Formic acid (v/v)).
Take 200 ng of the LRRK1 peptide digest in 5 µL or 10 µL of LC-buffer and prepare it for Evotip loading. Evotips are versatile disposable trap columns that enable <0.1% carry-over between samples.
Prepare the Evotips as described in the Protocol in PMID: 33367571.
Place the Evotips on the EvoSep autosampler and use the 30 samples per day (30 SPD) method to run the LC method through the Xcalibur interface, which is in line with the Orbitrap Exploris 240 mass spectrometer.
The EvoSep LC system performs a partial elution of the sample from the Evotip and loads it into the long storage loop together with the pre-formed gradient generated in the initial step. After loading, the high-pressure pump pushes the sample onto the analytical column (ReproSil-Pur C18, 1.9 µm beads by Dr Maisch, #EV1113).
The following MS instrument method can be constructed for the High-resolution HCD fragmentation analysis:
A | B | C |
---|---|---|
Instrument | Thermo Scientific Orbitrap Exploris 240 | |
LC system | EvoSep Liquid Chromatography system | 30 SPD method |
Method duration | 45 min | |
MS Global settings: | ||
Infusion mode: | Liquid Chromatography | |
Expected LC peak width (s): | 15 | |
Advanced Peak determination: | TRUE | |
Default charge state: | 2 | |
Internal mass calibration: | off | |
Full scan settings: | ||
Orbitrap resolution: | 120000 | |
Scan range (m/z): | 375-1500 | |
RF lens(%): | 70 | |
AGC target: | Custom | |
Normalized AGC target (%): | 300 | |
Maximum injection Time mode: | Custom | |
Maximum injection Time (ms): | 25 | |
Microscans: | 1 | |
Data type: | Profile | |
Polarity: | Positive | |
Filters: | ||
MIPS | Monoisotopic peak determination: | Peptide |
Relax restrictions when too few precursors are found: | TRUE | |
Intensity | Filter Type: | Intensity Threshold |
Intensity Threshold: | 5.00E+03 | |
Charge State | Include charge state(s): | 2 to 6 |
Include undetermined charge states: | False | |
Dynamic Exclusion | Dynamic Exclusion Mode: | Custom |
Exclude after n times: | 1 | |
Exclusion duration (s): | 5 | |
Mass Tolerance: | ppm | |
Low: | 10 | |
High | 10 | |
Exclude isotopes: | TRUE | |
Perform dependent scan on single charge state per precursor only: | FALSE | |
Data Dependent | Data Dependent Mode: | Number of Scans |
Number of Dependent Scans | 10 | |
ddMS2 settings | Isolation Window (m/z): | 1.2 |
Isolation Offset: | Off | |
Collision Energy Mode: | Fixed | |
Collision Energy Type: | Normalized | |
HCD Collision Energy (%): | 28 | |
Orbitrap resolution: | 15000 | |
First Mass (m/z): | 110 | |
Scan range mode: | Auto | |
AGC target: | Standard | |
Maximum injection Time mode: | Custom | |
Maximum injection Time (ms): | 100 | |
Microscans: | 1 | |
Data type: | Profile | |
Polarity: | Positive |
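To judge whether a given phosphopeptide can be selected with this method, its m/z can be compared with the full scan range (375-1500) and the charge state filter (2 to 6). The sketch below does this with standard monoisotopic residue masses and the modification masses listed in the Sequest HT search settings (Phospho +79.966 Da, Carbamidomethyl +57.021 Da). The function names and the example peptide are ours and purely illustrative.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.966          # dynamic modification on S/T/Y
CARBAMIDOMETHYL = 57.021  # static modification on C


def peptide_mass(sequence, n_phospho=0):
    """Neutral monoisotopic mass with carbamidomethylated Cys and n phospho groups."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass += sequence.count("C") * CARBAMIDOMETHYL
    return mass + n_phospho * PHOSPHO


def selectable_charges(sequence, n_phospho=0, mz_range=(375, 1500), charges=range(2, 7)):
    """Map each charge state whose m/z falls inside the scan range to that m/z."""
    mass = peptide_mass(sequence, n_phospho)
    hits = {}
    for z in charges:
        mz = (mass + z * PROTON) / z
        if mz_range[0] <= mz <= mz_range[1]:
            hits[z] = round(mz, 4)
    return hits


print(selectable_charges("PEPTIDE", n_phospho=1))  # hypothetical peptide
```

Very short or very long peptides fall outside the window at every allowed charge state, which is one reason to digest with more than one protease.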
Data analysis
Transfer the raw data for searching with the Thermo Scientific Proteome Discoverer 2.4 software suite, which is integrated with the Sequest-HT search algorithm.
We recommend creating a custom protein sequence FASTA file rather than using the entire UniProt Human or Mouse proteome FASTA file. For example, copy the human LRRK1 FASTA sequence, paste it into Notepad++ and save it as LRRK1.FASTA.
Import the LRRK1.FASTA sequence into the PD 2.4 software.
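As an alternative to a text editor, the custom FASTA file can also be written programmatically. This is a minimal sketch: `write_fasta` is a helper of our own, and the header and sequence used in the demonstration are placeholders, not the LRRK1 entry. Paste the real header and sequence from UniProt in their place.

```python
import tempfile
from pathlib import Path


def write_fasta(path, header, sequence, width=60):
    """Write a single-entry FASTA file and return the number of residues written."""
    sequence = "".join(sequence.split()).upper()   # drop whitespace and line breaks
    if not sequence.isalpha():
        raise ValueError("sequence must contain only letters")
    lines = [f">{header}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(sequence)


# Demonstration with a placeholder sequence (NOT the LRRK1 sequence).
demo_path = Path(tempfile.mkdtemp()) / "LRRK1.FASTA"
n_residues = write_fasta(demo_path, "placeholder_header", "ACDEFGHIKL" * 7)
print(n_residues, demo_path.read_text().splitlines()[0])
```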
Construct the Processing and Consensus workflows
A | B | C |
---|---|---|
------------------------------------------------------------------ | ||
The Processing workflow tree | ||
------------------------------------------------------------------ | ||
(0) Spectrum Files | ||
(1) Spectrum Selector | ||
(2) Sequest HT | ||
(3) Fixed Value PSM Validator | ||
(4) IMP-ptmRS | ||
(5) Minora Feature Detector | ||
------------------------------------------------------------------ | ||
Processing node 0 | Spectrum Files | |
------------------------------------------------------------------ | ||
Input Data | Note | |
File Name(s) | Specify the sample condition and the enzyme associated with the digestion | |
RN-AM_211216_LRRK1_+PKC_Tryp-LysC_01.raw | ||
RN-AM_211216_LRRK1_+PKC_Tryp-LysC_01.raw | ||
RN-AM_211216_LRRK1_-PKC_Tryp-LysC_01.raw | ||
RN-AM_211216_LRRK1_-PKC_Tryp-LysC_01.raw | ||
------------------------------------------------------------------ | ||
Processing node 1 | Spectrum Selector | |
------------------------------------------------------------------ | ||
1. General Settings | ||
Precursor Selection | Use MS1 Precursor | |
Use Isotope Pattern in Precursor Reevaluation | True | |
Provide Profile Spectra | Automatic | |
2. Spectrum Properties Filter | ||
Lower RT Limit | 0 | |
Upper RT Limit | 0 | |
First Scan | 0 | |
Last Scan | 0 | |
Lowest Charge State | 0 | |
Highest Charge State | 0 | |
Min. Precursor Mass | 350 Da | |
Max. Precursor Mass | 5000 Da | |
Total Intensity Threshold | 0 | |
Minimum Peak Count | 1 | |
3. Scan Event Filters | ||
Mass Analyzer | Is FTMS | |
MS Order | Is MS2; MS1 | |
Activation Type | Is HCD | |
Min. Collision Energy | 0 | |
Max. Collision Energy | 1000 | |
Scan Type | Is Full | |
Polarity Mode | Is + | |
4. Peak Filters | ||
- S/N Threshold (FT-only) | 1.5 | |
5. Replacements for Unrecognized Properties | ||
Unrecognized Charge Replacements | Automatic | |
Unrecognized Mass Analyzer Replacements | FTMS | |
Unrecognized MS Order Replacements | MS2 | |
Unrecognized Activation Type Replacements | HCD | |
Unrecognized Polarity Replacements | + | |
Unrecognized MS Resolution@200 Replacements | 120000 | |
Unrecognized MSn Resolution@200 Replacements | 30000 | |
6. Precursor Pattern Extraction | ||
Precursor Clipping Range Before | 2.5 Da | |
Precursor Clipping Range After | 5.5 Da | |
------------------------------------------------------------------ | ||
Processing node 2 | Sequest HT | |
------------------------------------------------------------------ | ||
1. Input Data | ||
Protein Database | LRRK1.FASTA | |
Enzyme Name | Trypsin (Full) | Here, specify AspN and Chymotrypsin separately for the searches associated with those conditions |
Max. Missed Cleavage Sites | 2 | |
Min. Peptide Length | 7 | |
Max. Peptide Length | 144 | |
Max. Number of Peptides Reported | 10 | |
2. Tolerances | ||
Precursor Mass Tolerance | 10 ppm | |
Fragment Mass Tolerance | 0.05 Da | |
Use Average Precursor Mass | False | |
Use Average Fragment Mass | False | |
3. Spectrum Matching | ||
Use Neutral Loss a Ions | True | |
Use Neutral Loss b Ions | True | |
Use Neutral Loss y Ions | True | |
Use Flanking Ions | True | |
Weight of a Ions | 0 | |
Weight of b Ions | 1 | |
- Weight of c Ions | 0 | |
Weight of x Ions | 0 | |
Weight of y Ions | 1 | |
Weight of z Ions | 0 | |
4. Dynamic Modifications | ||
Max. Equal Modifications Per Peptide | 3 | |
Max. Dynamic Modifications Per Peptide | 4 | |
- 1. Dynamic Modification | Oxidation / +15.995 Da (M) | |
- 2. Dynamic Modification | Phospho / +79.966 Da (S, T, Y) | |
7. Static Modifications | ||
- 1. Static Modification | Carbamidomethyl / +57.021 Da (C) | |
------------------------------------------------------------------ | ||
Processing node 3 | Fixed Value PSM Validator | |
------------------------------------------------------------------ | ||
1. Input Data | ||
Maximum Delta Cn | 0.05 | |
Maximum Rank | 0 | |
------------------------------------------------------------------ | ||
Processing node 4 | IMP-ptmRS | |
------------------------------------------------------------------ | ||
1. Scoring | ||
PhosphoRS Mode | True | |
Report only PTMs | True | |
Use Diagnostic Ions | True | |
Use Fragment Mass Tolerance of Search Node | True | |
Fragment Mass Tolerance | 0.5 Da | |
Consider Neutral Loss peaks for CID, HCD and EThcD | Automatic | |
Maximum Peak Depth | 8 | |
Use a Mass accuracy correction | False | |
2. Performance | ||
Maximum Number of Position Isoforms | 500 | |
Maximum PTMs Per Peptide | 10 | |
------------------------------------------------------------------ | ||
Processing node 5 | Minora Feature Detector | |
------------------------------------------------------------------ | ||
1. Peak & Feature Detection | ||
Min. Trace Length | 5 | |
- Max. ΔRT of Isotope Pattern Multiplets [min] | 0.2 | |
2. Feature to ID Linking | ||
PSM Confidence At Least | High |
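The 10 ppm precursor mass tolerance used in the Sequest HT node is relative, so the absolute window grows with m/z. The sketch below shows the conversion; the function names are ours and the m/z values are arbitrary examples.

```python
def ppm_window(mz, tol_ppm):
    """Return the (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def ppm_error(observed, theoretical):
    """Signed mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


low, high = ppm_window(1000.0, 10)     # 10 ppm at m/z 1000
error = ppm_error(500.005, 500.0)      # 5 mDa deviation at m/z 500
print(low, high, error)
```

By contrast, the fragment tolerance in the search settings is given in absolute units (0.05 Da) and does not scale with m/z.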
A | B |
---|---|
The Consensus workflow tree | |
------------------------------------------------------------------ | |
(0) MSF Files | |
(1) PSM Grouper | |
(2) Peptide Validator | |
(3) Peptide and Protein Filter | |
(4) Protein Scorer | |
(5) Protein Grouping | |
(6) Peptide in Protein Annotation | |
(15) Modification Sites | |
(7) Protein FDR Validator | |
(16) Peptide Isoform Grouper | |
(10) Feature Mapper | |
(11) Precursor Ions Quantifier | |
Post-processing nodes | |
-------------------------------- | |
(12) Result Statistics | |
(13) Display Settings | |
(14) Data Distributions | |
------------------------------------------------------------------ | |
Processing node 0 | MSF Files |
------------------------------------------------------------------ | |
1. Storage Settings | |
Spectra to Store | Identified or Quantified |
Feature Traces to Store | All |
2. Merging of Identified Peptide and Proteins | |
Merge Mode | Globally by Search Engine Type |
3. FASTA Title Line Display | |
Reported FASTA Title Lines | Best match |
Title Line Rule | standard |
4. PSM Filters | |
Maximum Delta Cn | 0.05 |
Maximum Rank | 0 |
Maximum Delta Mass | 0 ppm |
Hidden Parameters | |
MSF File(s) | RN-AM_211216_LRRK1_Sequest-Trypsin-(1).msf |
------------------------------------------------------------------ | |
Processing node 1 | PSM Grouper |
------------------------------------------------------------------ | |
1. Peptide Group Modifications | |
Site Probability Threshold | 75 |
------------------------------------------------------------------ | |
Processing node 2 | Peptide Validator |
------------------------------------------------------------------ | |
1. General Validation Settings | |
Validation Mode | Automatic (Control peptide level error rate if possible) |
Target FDR (Strict) for PSMs | 0.01 |
Target FDR (Relaxed) for PSMs | 0.05 |
Target FDR (Strict) for Peptides | 0.01 |
Target FDR (Relaxed) for Peptides | 0.05 |
2. Specific Validation Settings | |
Validation Based on | q-Value |
Target/Decoy Selection for PSM Level FDR Calculation Based on Score | Automatic |
Reset Confidences for Nodes without Decoy Search (Fixed Score thresholds) | False |
------------------------------------------------------------------ | |
Processing node 3 | Peptide and Protein Filter |
------------------------------------------------------------------ | |
1. Peptide Filters | |
Peptide Confidence At Least | High |
Keep Lower Confident PSMs | False |
Minimum Peptide Length | 7 |
Remove Peptides without Protein Reference | False |
2. Protein Filters | |
Minimum Number of Peptide Sequences | 1 |
Count Only Rank 1 Peptides | False |
Count Peptides only for Top Scored Protein | False |
------------------------------------------------------------------ | |
Processing node 4 | Protein Scorer |
------------------------------------------------------------------ | |
No parameters | |
------------------------------------------------------------------ | |
Processing node 5 | Protein Grouping |
------------------------------------------------------------------ | |
1. Protein Grouping | |
Apply Strict parsimony principle | True |
------------------------------------------------------------------ | |
Processing node 6 | Peptide in Protein Annotation |
------------------------------------------------------------------ | |
1. Flanking Residues | |
Annotate Flanking Residues of the Peptide | True |
Number Flanking Residues in Connection Tables | 1 |
2. Modifications in Peptide | |
Protein Modifications Reported | Only for Master Proteins |
3. Modifications in Protein | |
Modification Sites Reported | All And Specific |
Minimum PSM Confidence | High |
Report only PTMs | True |
4. Positions in Protein | |
Protein Positions for Peptides | Only for Master Proteins |
------------------------------------------------------------------ | |
Processing node 15 | Modification Sites |
------------------------------------------------------------------ | |
1. General | |
Report only PTMs | True |
only Master Proteins | True |
Motif Radius | 10 |
------------------------------------------------------------------ | |
Processing node 7 | Protein FDR Validator |
------------------------------------------------------------------ | |
1. Confidence Thresholds | |
Target FDR (Strict) | 0.01 |
Target FDR (Relaxed) | 0.05 |
------------------------------------------------------------------ | |
Processing node 16 | Peptide Isoform Grouper |
------------------------------------------------------------------ | |
No parameters | |
------------------------------------------------------------------ | |
Processing node 10 | Feature Mapper |
------------------------------------------------------------------ | |
1. Chromatographic Alignment | |
Perform RT Alignment | True |
- Maximum RT Shift [min] | 10 |
Mass Tolerance | 10 ppm |
Parameter Tuning | Coarse |
2. Feature Linking and Mapping | |
RT Tolerance [min] | 0 |
Mass Tolerance | 0 ppm |
Min. S/N Threshold | 5 |
------------------------------------------------------------------ | |
Processing node 11 | Precursor Ions Quantifier |
------------------------------------------------------------------ | |
1. General Quantification Settings | |
Peptides to Use | Unique + Razor |
Consider Protein Groups for Peptide Uniqueness | True |
Use Shared Quan Results | True |
Reject Quan Results with Missing Channels | False |
2. Precursor Quantification | |
Precursor Abundance Based on | Intensity |
Min. # Replicate Features [%] | 0 |
3. Normalization and Scaling | |
Normalization Mode | Total Peptide Amount |
Scaling Mode | On All Average |
4. Exclude Peptides from Protein Quantification | |
for Normalization | Use All Peptides |
for Protein Roll-Up | Use All Peptides |
for Pairwise Ratios | Exclude Modified |
5. Quan Rollup and Hypothesis Testing | |
Protein Abundance Calculation | Summed Abundances |
N for Top N | 3 |
Protein Ratio Calculation | Pairwise Ratio Based |
Maximum Allowed Fold Change | 100 |
Imputation Mode | None |
Hypothesis Test | t-test (Background Based) |
6. Quan Ratio Distributions | |
- 1st Fold Change Threshold | 2 |
- 2nd Fold Change Threshold | 4 |
- 3rd Fold Change Threshold | 6 |
- 4th Fold Change Threshold | 8 |
- 5th Fold Change Threshold | 10 |
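The Precursor Ions Quantifier is set to the "Total Peptide Amount" normalization mode. Our understanding, stated here as an assumption rather than as the vendor's implementation, is that each sample is scaled so that its summed peptide abundance matches the highest sum across samples. The sketch below illustrates that idea with made-up abundances.

```python
def normalize_total_amount(samples):
    """Scale each sample so that its total abundance equals the largest total.

    `samples` maps a sample name to a list of peptide abundances.
    This mirrors our assumed reading of 'Total Peptide Amount' normalization.
    """
    totals = {name: sum(values) for name, values in samples.items()}
    reference = max(totals.values())
    return {
        name: [value * reference / totals[name] for value in values]
        for name, values in samples.items()
    }


# Hypothetical abundances for two conditions.
raw = {"+PKC": [10.0, 30.0], "-PKC": [5.0, 15.0]}
normalized = normalize_total_amount(raw)
print(normalized)
```

Because only LRRK1-derived peptides are expected in an excised band, this kind of normalization mainly corrects for differences in the amount of protein recovered per lane.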
If the database search is to be performed with MaxQuant, refer to the settings below.
Parameter | Value |
---|---|
Version | 2.0.3.0 |
User name | RNirujogi |
Machine name | MRC-MS-R640-4 |
Date of writing | 05/23/2022 15:15:41 |
Include contaminants | TRUE |
PSM FDR | 0.01 |
PSM FDR Crosslink | 0.01 |
Protein FDR | 0.01 |
Site FDR | 0.01 |
Use Normalized Ratios For Occupancy | TRUE |
Min. peptide Length | 7 |
Min. score for unmodified peptides | 0 |
Min. score for modified peptides | 40 |
Min. delta score for unmodified peptides | 0 |
Min. delta score for modified peptides | 6 |
Min. unique peptides | 0 |
Min. razor peptides | 1 |
Min. peptides | 1 |
Use only unmodified peptides and | TRUE |
Modifications included in protein quantification | Oxidation (M);Acetyl (Protein N-term);Deamidation (NQ) |
Peptides used for protein quantification | Razor |
Discard unmodified counterpart peptides | TRUE |
Label min. ratio count | 2 |
Use delta score | FALSE |
iBAQ | FALSE |
iBAQ log fit | FALSE |
Match between runs | FALSE |
Find dependent peptides | FALSE |
Fasta file | C:\Raja\Database\LRRK1.FASTA |
Decoy mode | revert |
Include contaminants | TRUE |
Advanced ratios | TRUE |
Fixed andromeda index folder | |
Combined folder location | |
Second peptides | TRUE |
Stabilize large LFQ ratios | TRUE |
Separate LFQ in parameter groups | FALSE |
Require MS/MS for LFQ comparisons | TRUE |
Calculate peak properties | FALSE |
Main search max. combinations | 200 |
Advanced site intensities | TRUE |
Write msScans table | FALSE |
Write msmsScans table | TRUE |
Write ms3Scans table | TRUE |
Write allPeptides table | TRUE |
Write mzRange table | TRUE |
Write DIA fragments table | FALSE |
Write DIA fragments quant table | FALSE |
Write pasefMsmsScans table | TRUE |
Write accumulatedMsmsScans table | TRUE |
Max. peptide mass [Da] | 4600 |
Min. peptide length for unspecific search | 8 |
Max. peptide length for unspecific search | 25 |
Razor protein FDR | TRUE |
Disable MD5 | FALSE |
Max mods in site table | 3 |
Match unidentified features | FALSE |
Epsilon score for mutations | |
Evaluate variant peptides separately | TRUE |
Variation mode | None |
MS/MS tol. (FTMS) | 20 ppm |
Top MS/MS peaks per Da interval. (FTMS) | 12 |
Da interval. (FTMS) | 100 |
MS/MS deisotoping (FTMS) | TRUE |
MS/MS deisotoping tolerance (FTMS) | 7 |
MS/MS deisotoping tolerance unit (FTMS) | ppm |
MS/MS higher charges (FTMS) | TRUE |
MS/MS water loss (FTMS) | TRUE |
MS/MS ammonia loss (FTMS) | TRUE |
MS/MS dependent losses (FTMS) | TRUE |
MS/MS recalibration (FTMS) | FALSE |
MS/MS tol. (ITMS) | 0.5 Da |
Top MS/MS peaks per Da interval. (ITMS) | 8 |
Da interval. (ITMS) | 100 |
MS/MS deisotoping (ITMS) | FALSE |
MS/MS deisotoping tolerance (ITMS) | 0.15 |
MS/MS deisotoping tolerance unit (ITMS) | Da |
MS/MS higher charges (ITMS) | TRUE |
MS/MS water loss (ITMS) | TRUE |
MS/MS ammonia loss (ITMS) | TRUE |
MS/MS dependent losses (ITMS) | TRUE |
MS/MS recalibration (ITMS) | FALSE |
MS/MS tol. (TOF) | 40 ppm |
Top MS/MS peaks per Da interval. (TOF) | 10 |
Da interval. (TOF) | 100 |
MS/MS deisotoping (TOF) | TRUE |
MS/MS deisotoping tolerance (TOF) | 0.01 |
MS/MS deisotoping tolerance unit (TOF) | Da |
MS/MS higher charges (TOF) | TRUE |
MS/MS water loss (TOF) | TRUE |
MS/MS ammonia loss (TOF) | TRUE |
MS/MS dependent losses (TOF) | TRUE |
MS/MS recalibration (TOF) | FALSE |
MS/MS tol. (Unknown) | 20 ppm |
Top MS/MS peaks per Da interval. (Unknown) | 12 |
Da interval. (Unknown) | 100 |
MS/MS deisotoping (Unknown) | TRUE |
MS/MS deisotoping tolerance (Unknown) | 7 |
MS/MS deisotoping tolerance unit (Unknown) | ppm |
MS/MS higher charges (Unknown) | TRUE |
MS/MS water loss (Unknown) | TRUE |
MS/MS ammonia loss (Unknown) | TRUE |
MS/MS dependent losses (Unknown) | TRUE |
MS/MS recalibration (Unknown) | FALSE |
Site tables | Deamidation (NQ)Sites.txt;Oxidation (M)Sites.txt;Phospho (ST)Sites.txt |
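If MaxQuant is used, the phosphosites are reported in the Phospho (ST)Sites.txt site table. The sketch below shows one way to keep only confidently localized sites from such a table with the Python standard library. The 0.75 cut-off is our assumption, chosen to mirror the site probability threshold of 75 used in the Proteome Discoverer workflow, and the rows in the example are invented.

```python
import csv
import io


def confident_sites(handle, min_prob=0.75):
    """Return site labels (e.g. 'S100') passing the localization cut-off.

    Reverse hits and potential contaminants (flagged with '+') are skipped.
    """
    sites = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["Reverse"] == "+" or row["Potential contaminant"] == "+":
            continue
        if float(row["Localization prob"]) >= min_prob:
            sites.append(f'{row["Amino acid"]}{row["Position"]}')
    return sites


# Invented example rows using column names from the MaxQuant site table.
EXAMPLE_TABLE = (
    "Amino acid\tPosition\tLocalization prob\tReverse\tPotential contaminant\n"
    "S\t100\t0.99\t\t\n"
    "T\t250\t0.51\t\t\n"
    "S\t300\t0.95\t+\t\n"
)
sites = confident_sites(io.StringIO(EXAMPLE_TABLE))
print(sites)
```

For a real file, pass an open file handle, for example `open("Phospho (ST)Sites.txt", newline="")`, instead of the in-memory example.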
Data analysis and Visualization
Manually verify the MS/MS spectra and the phosphorylation localization scores within PD 2.4.
Export the filtered phosphosites from the modifications table for each sample/category.
Use the script below to parse and combine the data and to generate a heatmap representation.
The script first reads the phosphosite mapping results and maps them onto the original protein amino acid sequence by combining the PeptideGroups and ModificationSites result text files. The data are filtered for a site probability greater than or equal to 75 and grouped by the protease used for digestion. For each unique combination of motif, position and sample condition, only the entry with the highest abundance is kept. The data are then divided along the protein sequence into consecutive windows of 500 amino acids. Each window is used to create a heatmap in which the color represents the peptide abundance (as a z-score of the log2 abundance), the sample conditions are shown on the X-axis and the phosphosite positions are shown on the Y-axis in ascending order.
import re
from glob import glob

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

if __name__ == "__main__":
    # Proteases to include; uncomment "Trypsin" to add the Trypsin + LysC digest.
    proteases = ["AspN", "Chymotrypsin",
                 # "Trypsin"
                 ]
    files = ["PeptideGroups", "ModificationSites"]
    window = 500  # number of residues shown per heatmap

    # Load the PD 2.4 text exports. Each file name must contain the protease
    # name and the table name (PeptideGroups or ModificationSites).
    results = {}
    for path in glob(r"\\mrc-smb.lifesci.dundee.ac.uk\mrc-group-folder\ALESSI\Toan\TS22D4_Phosphosite mapping_02\*.txt"):
        for p in proteases:
            if p in path:
                for f in files:
                    if f in path:
                        results.setdefault(p, {})[f] = pd.read_csv(path, sep="\t")
                        break
                break

    merged_df = []
    columns = set()
    for p in proteases:
        pg = results[p][files[0]]
        ms = results[p][files[1]]

        primary_ids, phospho_sites, positions = [], [], []
        for _, r in pg.iterrows():
            # Strip the flanking-residue annotation from the annotated sequence.
            primary_ids.append(";".join([r["Master Protein Accessions"], r["Annotated Sequence"][4:-4]]))
            start = re.search(r"\[(\d+)-(\d+)\]", str(r["Positions in Master Proteins"]))
            mods = r["Modifications"] if isinstance(r["Modifications"], str) else ""
            phos, pos = [], []
            for m in mods.split("]; "):
                if "Phospho" not in m or start is None:
                    continue
                s = re.search(r"\[(.+)", m)
                if s:
                    for si in s.group(1).split("; "):
                        sire = re.search(r"(\w)(\d+)\(", si)
                        if sire:
                            phos.append(sire.group(1) + sire.group(2))
                            # Convert the position in the peptide to the position in the protein.
                            pos.append(int(sire.group(2)) + int(start.group(1)) - 1)
            phospho_sites.append(phos)
            positions.append(pos)
        pg["Primary IDs"] = primary_ids
        pg["Phospho"] = pd.Series(phospho_sites, index=pg.index)
        pg["Position"] = pd.Series(positions, index=pg.index)

        # One row per phosphosite; peptides without phosphosites are dropped.
        pg = pg.explode(["Phospho", "Position"])
        pg = pg[pd.notnull(pg["Phospho"])].copy()
        pg["Position"] = pg["Position"].astype(int)

        ms["Primary IDs"] = ms["Protein Accession"] + ";" + ms["Peptide Sequence"]
        rpg = pg[[c for c in pg.columns if c.startswith("Abundance")] + ["Primary IDs", "Phospho", "Position", "Modifications"]]
        rename = {}
        for c in rpg.columns:
            if "Abundance" in c:
                rename[c] = re.sub(r"Abundance: F\d+: Sample, ", "", c)
                columns.add(rename[c])
        print(rpg["Primary IDs"])
        print(ms["Primary IDs"])
        rpg = rpg.rename(columns=rename)
        ms["Phospho"] = ms["Target Amino Acid"] + ms["Position in Peptide"].astype(str)
        ms["Enzymes"] = p
        merged_df.append(ms.merge(rpg, on=["Primary IDs", "Phospho"]))

    merged_df = pd.concat(merged_df, ignore_index=True)
    merged_df = merged_df[merged_df["Site Probability"] >= 75]
    result = pd.melt(merged_df, id_vars=["Phospho", "Position_y", "Enzymes", "Motif"],
                     value_vars=list(columns), var_name="Samples", value_name="Abundance")

    # Keep the highest abundance per position, sample, protease and motif.
    a = result.groupby(["Position_y", "Samples", "Enzymes", "Motif"]).max()
    a.reset_index(inplace=True)
    print(a["Samples"])
    # Sample names are expected to look like "<condition>Rep-<replicate>".
    a[["Conditions", "Replicates"]] = a["Samples"].str.split("Rep-", n=1, expand=True)

    # Blank the motif unless at least one protease/condition has more than one measured value.
    for _, g in a.groupby(["Position_y", "Motif"]):
        remove_motif = True
        for _, g2 in g.groupby(["Enzymes", "Conditions"]):
            if g2["Abundance"].notnull().sum() > 1:
                remove_motif = False
                break
        if remove_motif:
            a.loc[g.index, "Motif"] = ""

    a.sort_values("Position_y", inplace=True)
    samples = a["Samples"].unique()
    multiindex = pd.MultiIndex.from_tuples([(p, s) for p in proteases for s in samples],
                                           names=["Enzymes", "Samples"])
    # Basic residue (R/K) at -2 and +3 relative to the phosphorylated S/T (lower case).
    motif_re = re.compile(r"[RK]\w[ts]\w\w[RK]")
    top_margin = bottom_margin = left_margin = right_margin = 0.2

    n = window
    # Loop over consecutive windows until the last phosphosite position is covered.
    while n - window < a["Position_y"].max():
        c = a[(a["Position_y"] <= n) & (a["Position_y"] > (n - window))]
        if c.empty:
            n += window
            continue
        figure_height = (len(c.index) / 10) / (1 - top_margin - bottom_margin)
        figure_width = 10 / (1 - left_margin - right_margin)
        c = c.set_index(["Position_y", "Samples", "Enzymes", "Motif"])
        c = c.unstack("Enzymes")
        b = pd.pivot_table(c, values="Abundance", columns="Samples", index=["Position_y", "Motif"])
        b.fillna(0, inplace=True)

        # Z-score the log2 abundances per site; zero (missing) values become NaN.
        b = b.T
        for col in b.columns:
            logs = np.log2(b[col][b[col] > 0])
            b[col] = (logs - logs.mean()) / logs.std(ddof=1)
        b = b.T

        # Arrange the columns as protease x sample; absent combinations are set to 0.
        new_df = pd.DataFrame(index=b.index, columns=multiindex, dtype=float)
        for col in new_df.columns:
            new_df[col] = b[col] if col in b.columns else 0.0
        new_df.to_csv(f"merged{n}.csv")

        fig, ax = plt.subplots(
            figsize=(figure_width, figure_height),
            gridspec_kw=dict(top=1 - top_margin, bottom=bottom_margin, left=left_margin, right=1 - right_margin)
        )
        sns.heatmap(new_df, cmap="YlGnBu", mask=new_df.isna(), square=True, ax=ax)
        ax.set_facecolor("silver")
        ax.xaxis.tick_top()
        ax.xaxis.set_label_position("top")
        for label in ax.get_yticklabels() + ax.get_xticklabels():
            label.set_weight("bold")
        plt.xticks(rotation=90)
        plt.savefig(f"result{n}.pdf")
        plt.close(fig)

        # Report sites whose motif matches the basic-residue pattern.
        for idx in b.index:
            if idx[1] != "" and motif_re.search(idx[1]):
                print(idx)
        n += window
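Two steps of the script are easy to lose in the data handling: the per-site z-scoring of log2 abundances and the assignment of sites to 500-residue windows. The standard-library sketch below isolates both under our own function names. It is an illustration of the idea, not a replacement for the script, and the abundance values are invented.

```python
import math
from statistics import mean, stdev


def log2_zscores(abundances):
    """Z-score the log2 of positive abundances; zero or missing values give None."""
    def valid(x):
        return x is not None and x > 0

    logs = [math.log2(x) for x in abundances if valid(x)]
    if len(logs) < 2:
        return [None] * len(abundances)   # a z-score needs at least two values
    mu, sd = mean(logs), stdev(logs)      # sample standard deviation (ddof = 1)
    if sd == 0:
        return [0.0 if valid(x) else None for x in abundances]
    return [(math.log2(x) - mu) / sd if valid(x) else None for x in abundances]


def window_upper_bound(position, size=500):
    """Upper bound of the window that contains a 1-based residue position."""
    return ((position - 1) // size + 1) * size


z = log2_zscores([1, 4, 0])  # invented abundances; 0 means not detected
print(z, window_upper_bound(500), window_upper_bound(501))
```

With a window size of 500, a site at position 500 is plotted in the first heatmap and a site at position 501 in the second.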