Allow select_atoms to select chain #2875

xiki-tempula · 2020-07-28T13:33:28Z

Is your feature request related to a problem?

The PDB standard defines column 22 as the chain ID.
The CHARMM convention defines the segment ID as a 4-character field starting at column 73.
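To make the two column positions concrete, here is a minimal sketch that reads both fields from one fixed-width ATOM record. The function name `parse_chain_and_segid` and the example record are made up for illustration; this is not how MDAnalysis parses PDB files.

```python
def parse_chain_and_segid(line):
    """Return (chain ID, segment ID) from one fixed-width PDB ATOM/HETATM line."""
    padded = line.rstrip("\n").ljust(80)   # short lines are padded with blanks
    chain_id = padded[21].strip()          # column 22 (1-based)
    seg_id = padded[72:76].strip()         # columns 73-76 (1-based)
    return chain_id, seg_id


# Example record assembled field by field so the column positions are explicit.
line = "".join([
    "ATOM  ",                               # columns 1-6: record name
    "    1",                                # columns 7-11: atom serial
    " ",                                    # column 12: blank
    " N  ",                                 # columns 13-16: atom name
    " ",                                    # column 17: alternate location
    "MET",                                  # columns 18-20: residue name
    " ",                                    # column 21: blank
    "A",                                    # column 22: chain ID
    "   1",                                 # columns 23-26: residue number
    "    ",                                 # columns 27-30: insertion code + blanks
    "  27.340", "  24.430", "   2.614",     # columns 31-54: x, y, z
    "  1.00", "  9.67",                     # columns 55-66: occupancy, B-factor
    "      ",                               # columns 67-72: blank
    "PROA",                                 # columns 73-76: segment ID
])

print(parse_chain_and_segid(line))  # ('A', 'PROA')
```

A record that stops before column 73 simply yields an empty segment ID.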

Currently, MDAnalysis uses the chain ID as the segment ID when the segment ID is absent, and ignores the chain ID when a segment ID is given.
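The consequence of that fallback can be sketched in a few lines. This paraphrases the behaviour described above, with a hypothetical helper `effective_segid`; it is not the actual MDAnalysis parser code.

```python
def effective_segid(chain_id, seg_id):
    """Segment ID wins when present; the chain ID is only used as a fallback."""
    return seg_id if seg_id else chain_id


# (chain ID, segment ID) pairs as they might be read from ATOM records.
atoms = [("A", ""), ("B", ""), ("A", "PROT"), ("B", "PROT")]
segids = [effective_segid(chain, seg) for chain, seg in atoms]
print(segids)  # ['A', 'B', 'PROT', 'PROT']
```

In the last two records the chain information is lost: chains A and B end up in the same segment, so they cannot be told apart by `segid` alone.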

Ideally, one could select a chain by its chain ID:

u.select_atoms('chainid A'), or u.select_atoms('chain A') if we do it the PyMOL way

Related to #2874


orbeckst · 2020-07-30T22:29:43Z

Chain vs Segment

chain

A chain is a polymer term, specifically from PDB files (see ATOM chainId, and TER and SEQRES for clarification), and originally means one polymer, as expressed for SEQRES (my emphasis):

SEQRES records contain a listing of the consecutive chemical components covalently linked in a linear fashion to form a polymer. The chemical components included in this listing may be standard or modified amino acid and nucleic acid residues. It may also include other residues that are linked to the standard backbone in the polymer. Chemical components or groups covalently linked to side-chains (in peptides) or sugars and/or bases (in nucleic acid polymers) will not be listed here.

Each SEQRES entry has a corresponding chainId in ATOM records and should be terminated with a TER (although in the wild this is often omitted).

Segment

A segment originates (as far as I know) from PSF files and is generally used to mark up a collection of molecules. It is often used to label single proteins, all lipids, all waters, or the whole solvent. The charmmtutorial.org page "CHARMM: The Basics: Molecule Metadata" treats "chain" and "segment" as equivalent:

Residues are further grouped into chains, or segments, which represent major functional units of the protein.

but then shows an example where all water molecules are in a segment with SEGID W.

In practice, segments are used as a convenient container for collections of "residues", where residues can either be building blocks of a polymer or individual molecules such as lipids or waters or bare ions.

selection keyword

A quick survey indicates that chain is probably a good keyword to use.

VMD

VMD's selections have the keywords

chain (str): the one-character chain identifier
fragment (num): a set of connected residues
segname (str): segment name

CHARMM

MDAnalysis selections were modelled after CHARMM, so unsurprisingly (see charmmtutorial.org: Atom Selection and c42b1 select) the keyword is the same:

segid (num): segment with numerical segment ID

PyMOL

See pymolwiki.org: Selection_Algebra

chain (char): Chain identifier
segi (char): Segment identifier (label_asym_id from mmCIF)
model (str): Atoms from object "1ubq" (e.g., "model 1ubq")

Related operators

bysegi (expr): Expands expr to complete segments
bychain (expr): Expands expr to complete chains
bymolecule (expr): Expands expr to complete molecules (connected with bonds)
byfragment (expr): ?

mdtraj

mdtraj does not seem to store chains/segids; at least based on mdtraj: Atom Selection Reference, it only lets users select by the internal chainid:

chainid (num): Chain index (0-based)
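The difference between a chain identifier and a 0-based chain index can be sketched as follows. The assumption that indices follow the order of first appearance is made purely for illustration and is not taken from mdtraj's implementation; `chain_indices` is a hypothetical helper.

```python
def chain_indices(chain_ids):
    """Map each chain identifier to a 0-based index in order of first appearance."""
    index_of = {}
    for cid in chain_ids:
        if cid not in index_of:
            index_of[cid] = len(index_of)
    return index_of


per_atom_chains = ["A", "A", "B", "B", "C"]
lookup = chain_indices(per_atom_chains)
print(lookup)  # {'A': 0, 'B': 1, 'C': 2}
```

With such a scheme a user who wants PDB chain "B" has to know that it is index 1, which is why an identifier-based keyword is more convenient.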

Feel free to correct me on any of the above.

Fixes #2925
Fixes #2875
Fixes #3054

Changes made in this Pull Request:
- added a class factory to subclass `core.selection.Selection` for each TopologyAttr
- added tokens to `core.selection.SameSelection`
- added `FloatRangeSelection` and `BoolSelection`
- added negatives, scientific notation and "to" delimiter for ranges
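The class-factory idea can be illustrated with a small, self-contained sketch. All names here (`Selection`, `make_attr_selection`, `REGISTRY`) are hypothetical stand-ins and do not reproduce the code in `core.selection`; atoms are plain dictionaries to keep the example runnable on its own.

```python
class Selection:
    """Minimal stand-in for a selection base class (hypothetical)."""
    token = None

    def __init__(self, values):
        self.values = set(values)

    def apply(self, atoms):
        raise NotImplementedError


def make_attr_selection(attr, token):
    """Create a Selection subclass that matches atoms on one attribute."""
    def apply(self, atoms):
        return [atom for atom in atoms if atom.get(attr) in self.values]

    # type() builds the subclass at runtime: name, bases, class namespace.
    return type(f"{attr}Selection", (Selection,),
                {"token": token, "attr": attr, "apply": apply})


# One generated class per attribute, looked up by its selection keyword.
REGISTRY = {}
for attr in ("chainID", "segid"):
    cls = make_attr_selection(attr, token=attr)
    REGISTRY[cls.token] = cls

atoms = [
    {"name": "N", "chainID": "A", "segid": "PROT"},
    {"name": "N", "chainID": "B", "segid": "PROT"},
]
selected = REGISTRY["chainID"](["A"]).apply(atoms)
print(selected)  # [{'name': 'N', 'chainID': 'A', 'segid': 'PROT'}]
```

The appeal of this design is that a new attribute gets a selection keyword without anyone writing a dedicated class for it.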

* Add arbitrary TopologyAttr selection (MDAnalysis#2927)

  Fixes MDAnalysis#2925
  Fixes MDAnalysis#2875
  Fixes MDAnalysis#3054

  Changes made in this Pull Request:
  - added a class factory to subclass `core.selection.Selection` for each TopologyAttr
  - added tokens to `core.selection.SameSelection`
  - added `FloatRangeSelection` and `BoolSelection`
  - added negatives, scientific notation and "to" delimiter for ranges

* Add ReadTheDocs configuration for PR builds (MDAnalysis#3060)
  - Adds RTD configuration
  - Add `environment.yml` for package installation

* Remove appveyor

* Install MDAnalysis on ReadTheDocs via pip (MDAnalysis#3071)

  Install via `pip install package/` to build current docs on ReadTheDocs

* try stringio

* rm metals file

* pin pytest

* pin pytest on gh actions

* Fixes RMSF docstring (Issue MDAnalysis#2806) (MDAnalysis#3033)

  Fixes the RMSF docstring's align command and adds transformation to make the results accurate

* MAINT: simplify guessers regex (MDAnalysis#3085)
  * the `SYMBOLS` regex in `guessers.py` does not require any escape sequences because the metacharacters are inactive in the character class (this includes the range metacharacter when placed at the start or end of the character class)

* MAINT: char class regex improve
  * avoid the overhead of a regex character class when that character class has only a single character (i.e., serves no purpose)
  * there is only one instance of this in MDA codebase discovered by my [scraping code](https://github.com/tylerjereddy/regex-improve)
  * for a longer explanation see my similar changes in NumPy codebase: numpy/numpy#18083

* Fix syntax warning over comparison of literals using is.

* Quick fix for atommethods to return empty residue group (MDAnalysis#3089)

  Returns empty residue group for _get_prev_residues_by_resid and _get_next_residues_by_resid

* Add to authors list.

Co-authored-by: Lily Wang <31115101+lilyminium@users.noreply.github.com>
Co-authored-by: IAlibay <irfan.alibay@gmail.com>
Co-authored-by: Tyler Reddy <tyler.je.reddy@gmail.com>
Co-authored-by: Lily Wang <lily@minium.com.au>
Co-authored-by: Irfan Alibay <IAlibay@users.noreply.github.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Karthikeyan Singaravelan <tir.karthi@gmail.com>
Co-authored-by: Aditya Kamath <48089312+aditya-kamath@users.noreply.github.com>


orbeckst added the Component-Selections label Jul 28, 2020

orbeckst mentioned this issue Jul 30, 2020

Prioritising the chain ID for segment id #2874

Closed

lilyminium mentioned this issue Sep 1, 2020

Add arbitrary TopologyAttr selection #2927

Merged

4 tasks

lilyminium closed this as completed in #2927 Dec 10, 2020

yuyuan871111 mentioned this issue Mar 11, 2025

Add force read chain to segment when reading a PDB file / set groups by ids #4948

Closed
