Implementation of SugarSelection #4790

Draft
wants to merge 9 commits into base: develop

talagayev commented Nov 14, 2024

Fixes #4563

Changes made in this Pull Request:

  • Creation of the SugarSelection class, which allows sugars to be selected by
    their known PDB, CHARMM and GLYCAM abbreviations
  • Addition of GLYCAM and SUGAR_PDB files to MDAnalysisTests.data
  • Addition of test_sugar_glycam_selection() and test_sugar_pdb_selection()
    to test_atomselections.py
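
As a rough illustration of the idea (plain Python, not the actual SugarSelection code in this PR), a residue-name-based selection boils down to a membership test against a set of known abbreviations. The name list below is a tiny illustrative subset, and `select_sugar_residues` is a hypothetical helper, not an MDAnalysis function:

```python
# Illustrative subset only; the list used in the PR is much longer.
SUGAR_RESNAMES = frozenset({"NAG", "GLC", "MAN", "BMA"})


def select_sugar_residues(residues, sugar_names=SUGAR_RESNAMES):
    """Return the (resname, resid) pairs whose residue name is a known sugar.

    `residues` is any iterable of (resname, resid) tuples; this stands in
    for the residues of a loaded structure.
    """
    return [(name, resid) for name, resid in residues if name in sugar_names]


# Toy input: one amino acid, two sugars, and glycerol (GOL).
residues = [("ALA", 1), ("NAG", 301), ("GOL", 302), ("GLC", 303)]
sugars = select_sugar_residues(residues)
print(sugars)
```

Only the residues whose names appear in the set are returned, so glycerol (GOL) is left out unless its name is added explicitly.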

Currently I use the abbreviations listed at:

https://glycam.org/docs/othertoolsservice/2016/06/09/3d-snfg-list-of-residue-names/index.html

In addition, I used the GLYCAM web server to obtain the known sugar abbreviations and also
included the aglycons that I obtained from it.

The pytest files were retrieved from the RCSB PDB and the GLYCAM web server:

https://glycam.org/

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4790.org.readthedocs.build/en/4790/

Implementation of SugarSelection with the known abbreviations and aglycans obtained from the glycam webserver

pep8speaks commented Nov 14, 2024

Hello @talagayev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-11-15 22:28:53 UTC

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.64%. Comparing base (b254921) to head (4cf52a4).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4790      +/-   ##
===========================================
- Coverage    93.66%   93.64%   -0.03%     
===========================================
  Files          177      189      +12     
  Lines        21742    22816    +1074     
  Branches      3055     3055              
===========================================
+ Hits         20365    21366    +1001     
- Misses         930     1003      +73     
  Partials       447      447              


talagayev commented Nov 15, 2024

@hmacdope I am pinging you here since you took part in the discussion that led to the issue this PR addresses.

The main problem I currently see is that the GLYCAM nomenclature, because of all the possible combinations, has a very large number of different names/abbreviations for the sugars.

Some of them could lead to tricky cases: for example, the allose nomenclature has RNA as one of its abbreviations.

I checked the RCSB PDB and found only one case where a unique ligand was called RNA, but this could still be dangerous if
users name something "RNA" in their own files.
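
One possible way to surface such clashes, sketched in plain Python: intersect the sugar abbreviations with the residue names actually present in a user's file. `find_ambiguous_names` and the example name sets are hypothetical and are not part of this PR or of MDAnalysis:

```python
def find_ambiguous_names(sugar_names, file_resnames):
    """Return, sorted, the residue names that are both sugar abbreviations
    and names present in the user's file."""
    return sorted(set(sugar_names) & set(file_resnames))


# "RNA" stands for the GLYCAM allose abbreviation mentioned above;
# the rest is an illustrative subset of sugar names.
sugar_names = {"NAG", "GLC", "RNA"}

# Residue names from a hypothetical user file in which "RNA" labels
# something that is not a sugar.
file_resnames = {"ALA", "SOL", "RNA"}

ambiguous = find_ambiguous_names(sugar_names, file_resnames)
print(ambiguous)
```

A check like this cannot tell whether "RNA" in a given file really is allose; it only flags names that would deserve a warning or an opt-out.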

As for coverage: judging by the entries in the RCSB PDB, the selection covers NAG, GLC, etc., which is convenient for anyone who wants to select those in PDB files. It does not cover glycerol, for example, which makes sense since glycerol is not a sugar, although it is quite common in crystal structures alongside sugars such as NAG and GLC. Would it make sense to have a selection that covers both sugars and glycerol-like compounds?

Successfully merging this pull request may close these issues.

Create sugar or carbohydrate selection using GLYCAM nomenclature