Inconsistent phospho IDs across different MaxQuant Versions #116

Open
dh2305 opened this issue Oct 2, 2024 · 2 comments
@dh2305

dh2305 commented Oct 2, 2024

I completely understand that different versions of software like MaxQuant (MQ) can produce different IDs and quantitative values to a certain (minimal) extent.

What I am experiencing now with a phosphoproteomic dataset, however, is a little mind-blowing. The dataset is DDA PASEF with 36 samples: a time course experiment with 3 biological replicates, sampled in two phases of a bioprocess with 6 time points each. Two replicates (26 and 27) initially had injection errors, so I reran them afterwards on a new column.

I have heard that MQ has improved PTM search integration in Andromeda since 2.5, especially for low-abundance features (in benchmark sets I see a >50% increase in IDs after filtering). Also, based on benchmark sets searched with versions 2.4 and 2.6, phosphosite allocation has become a little more stringent. Additionally, based on limited tests with the new versions, MBR may have become more funky.

Anyway, here is the point I cannot explain: after filtering, this 36-sample dataset has a biologically sound and comparable number of site IDs across replicates and all samples in MQ 2.4.10, while with 2.6.1 and 2.6.4 some samples lose their IDs almost completely (see below). This also happens at the phosphopeptide, peptide and protein levels. Initially, I thought it was a problem with MBR and the 2 samples from an independent run, but no, the error persists if I remove those samples. Also, the samples that end up with close to no IDs vary with the MQ version, and they also vary if I include the separately run samples (which brings me back to funky MBR). I also found a bug thread on GitHub where a weird taxonomy ID setting did something similar, but no, the problem still persisted (see the release for 2.6.5, where this error-producing setting is now switched off by default).
I am currently running a search with MBR completely off, but we will see. Additionally, I will do a FragPipe search for this phospho set as well.
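
For anyone who wants to check their own searches for the same pattern, here is a minimal sketch of the per-sample comparison described above. It is not part of MaxQuant. The function names, the `Intensity <sample>` column prefix, the 0.5 ratio cutoff and the example numbers are all assumptions made for illustration; none of the numbers come from this dataset.

```python
def count_ids_per_sample(rows, samples, prefix="Intensity "):
    """Count rows (e.g. sites) with a non-zero intensity in each sample.

    `rows` is a list of dicts, one per site, as produced by csv.DictReader.
    The column naming `prefix + sample` is an assumption; adjust it to
    match the headers of your own table.
    """
    counts = {s: 0 for s in samples}
    for row in rows:
        for s in samples:
            # Missing or empty cells are treated as "not identified".
            if float(row.get(prefix + s, 0) or 0) > 0:
                counts[s] += 1
    return counts


def flag_dropouts(baseline, candidate, min_ratio=0.5):
    """Return samples whose ID count fell below `min_ratio` of the baseline.

    `baseline` and `candidate` map sample name -> ID count for two
    searches of the same raw files (e.g. an older and a newer version).
    The 0.5 cutoff is an arbitrary illustrative choice.
    """
    flagged = {}
    for sample, base_n in baseline.items():
        if base_n == 0:
            continue  # nothing to compare against
        cand_n = candidate.get(sample, 0)
        ratio = cand_n / base_n
        if ratio < min_ratio:
            flagged[sample] = (base_n, cand_n, round(ratio, 3))
    return flagged


# Made-up counts for three hypothetical samples, for illustration only.
old_counts = {"S01": 1000, "S02": 980, "S03": 1010}
new_counts = {"S01": 1500, "S02": 12, "S03": 1490}
print(flag_dropouts(old_counts, new_counts))
```

Running the same comparison for each version pair makes it easy to see whether the affected samples change between versions, as reported here.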

Any idea why I am experiencing this with 2.6 versions and not with 2.4?

This also applies at the protein, peptide and phosphopeptide levels, not exclusively to ST phosphosites!

Additionally, I searched with 2.6.5, and neither the new version nor disabling MBR fixed the problem.

I further played (in the GUI and the mqpar file) with adjusting the MS1 and MS2 thresholds to match the 2.4.10 release, but to no avail. I also tried other variations of the dataset and another dataset entirely, but the issue persists. I would really appreciate a comment and help from an MQ person here. Thank you very much!

image

dh2305 added the MaxQuant label Oct 2, 2024
@JinqiuXiao

Hi there,

Can you please try it out with the latest MaxQuant 2.6.6.0? We fixed a bug in mass recalibration that could reduce the number of identifications in timsTOF DDA data. The bug was introduced between MQ 2.4.14 and MQ 2.5.0.0.

We regained the proper number of identifications with two different datasets that both showed reduced identifications in MQ versions after 2.4.

Thank you very much for letting us know about the issue. Please let us know how it works for your dataset :)

Best,
Jinqiu

@dh2305
Author

dh2305 commented Dec 5, 2024

Hi Jinqiu,

While this issue persisted to varying degrees in the different versions and approaches I tried above (incl. changing the settings removed from the GUI on the fly in the settings file; see table in the original post above), it seems to be gone with 2.6.6, which I tried immediately once I saw the release :)

The distribution of overall phosphorylation seems to match the one from the "stable" 2.4.10 version, as well as other software solutions like FragPipe. For transparency, I suppose the performance differences in site identification originate from the powerful rescoring tools inside the FragPipe workflow; this seems to be reflected more at the site level but also at the phosphopeptide level. However, metrics like enrichment efficiency [phosphopeptides/all peptides] are the same between software, which makes me very confident in the data itself. All of this was replicated with other in-house and PRIDE phospho datasets as well. Overall, class I IDs seem to have increased drastically, which is always welcomed with joy.
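
The two metrics mentioned above can be sketched in a few lines. Enrichment efficiency follows the definition given in the text (phosphopeptides/all peptides). The class I count assumes the commonly used localization probability cutoff of 0.75, which is not stated in this thread. The function names and the example numbers are hypothetical and are not results from this dataset.

```python
def enrichment_efficiency(n_phosphopeptides, n_all_peptides):
    """Fraction of identified peptides that are phosphopeptides."""
    if n_all_peptides <= 0:
        raise ValueError("n_all_peptides must be positive")
    return n_phosphopeptides / n_all_peptides


def count_class_one(localization_probs, threshold=0.75):
    """Count sites whose localization probability reaches the cutoff.

    The 0.75 default is the conventional class I cutoff and is an
    assumption here; whether the boundary is inclusive varies between
    workflows, so adjust the comparison if yours is strict.
    """
    return sum(1 for p in localization_probs if p >= threshold)


# Made-up numbers for two hypothetical searches of the same sample.
searches = {
    "search_A": {"phospho": 9000, "all": 10000},
    "search_B": {"phospho": 13500, "all": 15000},
}
for name, n in searches.items():
    print(name, enrichment_efficiency(n["phospho"], n["all"]))

print(count_class_one([0.99, 0.80, 0.75, 0.50]))
```

The ratio can stay the same between tools even when absolute ID numbers differ, which is why it works as a sanity check on the data rather than on the software.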

  1. A question that arises is: What work has been done since 2.4.10 to update phospho ID performance?

Please find the updated table below (with the 2.6.6 performance listed on the far right). I am happy I was of help here! For myself, I am excited to be able to use MQ again in parallel with other DDA solutions. As there is no clear consensus yet in the field of phosphoproteomics on the different algorithms for ID/site allocation and rel. quant, I do believe that combining different approaches for gathering and validating biological insights is crucial.

I am happy to help or discuss this further.

Kind regards,
Dominik

image

1.pdf
