Skip to content

MAINT: rounding residue partial charges #5035

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions package/MDAnalysis/core/topologyattrs.py
Original file line number Diff line number Diff line change
Expand Up @@ -2146,6 +2146,11 @@ class Charges(AtomAttr):
transplants = defaultdict(list)
dtype = float


def __init__(self, values, guessed=False, round_val=None):
super().__init__(values, guessed)
self.round = round_val

@staticmethod
def _gen_initial_values(na, nr, ns):
return np.zeros(na)
Expand All @@ -2160,6 +2165,9 @@ def get_residues(self, rg):
for i, row in enumerate(resatoms):
charges[i] = self.values[row].sum()

if self.round is not None:
charges = np.round(charges, self.round)

return charges

def get_segments(self, sg):
Expand Down
7 changes: 6 additions & 1 deletion package/MDAnalysis/topology/TOPParser.py
Original file line number Diff line number Diff line change
Expand Up @@ -453,7 +453,12 @@ def parse_charges(self, num_per_record, numlines):
vals = self.parsesection_mapper(numlines, lambda x: float(x))
charges = np.array(vals, dtype=np.float32)
charges /= 18.2223 # to electron charge units
attr = Charges(charges)
# in practice, the precision of partial charges
# in MD simulation forcefields is typically limited
# to a few decimal places (i.e., 1e-4 is already optimistic)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a citation perhaps, but only if there's an appetite for moving forward

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our recent work shows that changes on the order of 1e-3 can have large impacts on the free energy. I've seen (unreported) cases where 5e-4 does subtle changes, probably 1e-5 is the best bet?

Probably @lilyminium is the best person to weigh in on this though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some quick thoughts below, apologies I've just skimmed this issue/PR and I'm in the middle of travel.

For the same reasons Oliver and Irfan have mentioned I'm not super keen on auto-rounding, especially to something as low as 4 dp -- just to throw out an example, OpenFF's FFs will assign charges with more dp than that and can be written out to AMBER formats (and we ensure integer charge at the molecule level but not necessarily residue level). Could rounding be a kwarg that's passed to the TOPParser instead? Then I'd be 100% in favour of having this capability in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just quick comment, sorry to be brief, Lily: I don't think that a TOPparser argument is the answer. If we round atom charges on reading the topology, there's still no guarantee that they sum to integer values for each residue.

However, building in a rounding argument into any of our functions that aggregate (such as total_charge(decimals=N) (modelled after np.round() makes sense to me.

I don't see how to do this transparently for managed attributes such as residues.charges without maintaining some per-Universe state — which I don't like because something like Universe(..., decimals=4) is ambiguous (where do we round?) and also makes downstream analysis code dependent on how a universe was created. At the end of the day, I find np.round(residues.charges, decimals=4) a very clear way to express what the specific use case is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the clarification Oliver -- I was just looking at the changed lines in the PR where the kwarg is passed into the Charge attr on top-parsing. So as I understand it atom charges aren't rounded in this PR, but the residue-rounding is set on parsing. I agree that's not super intuitive and that making the user just round charges themselves is probably easier and clearer.

# see gh-5032
# here, we round the residue net charges to 4 decimal places
attr = Charges(charges, round_val=4)
return attr

def parse_masses(self, num_per_record, numlines):
Expand Down
28 changes: 27 additions & 1 deletion testsuite/MDAnalysisTests/topology/test_top.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@
import MDAnalysis as mda
import numpy as np
import pytest
from numpy.testing import assert_equal
from numpy.testing import assert_equal, assert_allclose

from MDAnalysisTests.datafiles import PRM # ache.prmtop
from MDAnalysisTests.datafiles import PRM7 # tz2.truncoct.parm7.bz2
Expand Down Expand Up @@ -837,3 +837,29 @@ def test_warning(self, parm, errmsgs):

for msg in errmsgs:
assert any(msg in recmsg for recmsg in messages)


def test_gh_5032():
u = mda.Universe(PRM)
# check individual residue.charge
actual = []
for res in u.residues:
actual.append(res.charge)
expected = [1.0, # N-terminal A
-1.0, # E
0.0, # F
0.0, # H (HIE)
1.0, # R
0.0, # W
0.0, # S
0.0, # S
0.0, # Y
0.0, # M
0.0, # V
0.0, # H (HIE)
0.0, # W
0.0] # C-terminal K
assert_allclose(actual, expected)
# check residues.charges, which are set
# separately
assert_allclose(u.residues.charges, expected)
Loading