MAINT: rounding residue partial charges #5035

tylerjereddy · 2025-04-18T17:19:40Z

Fixes Query: amino acid residue charges #5032.
Given that 1e-4 is already optimistic for electron charge unit precision in MD forcefields, this patch proposes to round residue net charges to 4 decimal places to avoid the confusion described in the matching ticket.
Some obvious questions might include: why are we only doing this for AMBER topologies? why only round at the residue level and not i.e., upstream at atom level and also at "segment" levels? This is mostly for scoping, because this initial patch is targeted at the specific issue raised, and because it probably doesn't make sense to proceed much farther until things are discussed.

PR Checklist

Issue raised/referenced?
Tests updated/added?
Documentation updated/added?
package/CHANGELOG file updated?
Is your name in package/AUTHORS? (If it is not, add it!)

Developers Certificate of Origin

I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.

📚 Documentation preview 📚: https://mdanalysis--5035.org.readthedocs.build/en/5035/

* Fixes MDAnalysisgh-5032. * Given that 1e-4 is already optimistic for electron charge unit precision in MD forcefields, this patch proposes to round residue net charges to 4 decimal places to avoid the confusion described in the matching ticket. * Some obvious questions might include: why are we only doing this for AMBER topologies? why only round at the residue level and not i.e., upstream at atom level and also at "segment" levels? This is mostly for scoping, because this initial patch is targeted at the specific issue raised, and because it probably doesn't make sense to proceed much farther until things are discussed.

tylerjereddy · 2025-04-18T17:20:07Z

package/MDAnalysis/topology/TOPParser.py

-        attr = Charges(charges)
+        # in practice, the precision of partial charges
+        # in MD simulation forcefields is typically limited
+        # to a few decimal places (i.e., 1e-4 is already optimistic)


needs a citation perhaps, but only if there's an appetite for moving forward

Our recent work shows that changes on the order of 1e-3 can have large impacts on the free energy. I've seen (unreported) cases where 5e-4 does subtle changes, probably 1e-5 is the best bet?

Probably @lilyminium is the best person to weigh in on this though.

Just some quick thoughts below, apologies I've just skimmed this issue/PR and I'm in the middle of travel.

For the same reasons Oliver and Irfan have mentioned I'm not super keen on auto-rounding, especially to something as low as 4 dp -- just to throw out an example, OpenFF's FFs will assign charges with more dp than that and can be written out to AMBER formats (and we ensure integer charge at the molecule level but not necessarily residue level). Could rounding be a kwarg that's passed to the TOPParser instead? Then I'd be 100% in favour of having this capability in the code.

Just quick comment, sorry to be brief, Lily: I don't think that a TOPparser argument is the answer. If we round atom charges on reading the topology, there's still no guarantee that they sum to integer values for each residue.

However, building in a rounding argument into any of our functions that aggregate (such as total_charge(decimals=N) (modelled after np.round() makes sense to me.

I don't see how to do this transparently for managed attributes such as residues.charges without maintaining some per-Universe state — which I don't like because something like Universe(..., decimals=4) is ambiguous (where do we round?) and also makes downstream analysis code dependent on how a universe was created. At the end of the day, I find np.round(residues.charges, decimals=4) a very clear way to express what the specific use case is.

I see, thanks for the clarification Oliver -- I was just looking at the changed lines in the PR where the kwarg is passed into the Charge attr on top-parsing. So as I understand it atom charges aren't rounded in this PR, but the residue-rounding is set on parsing. I agree that's not super intuitive and that making the user just round charges themselves is probably easier and clearer.

codecov · 2025-04-18T17:28:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.63%. Comparing base (e64755c) to head (36662df).

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #5035   +/-   ##
========================================
  Coverage    93.62%   93.63%           
========================================
  Files          177      177           
  Lines        21978    21983    +5     
  Branches      3110     3111    +1     
========================================
+ Hits         20578    20583    +5     
  Misses         946      946           
  Partials       454      454

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

orbeckst · 2025-05-09T16:54:23Z

To start a discussion here: I am generally not too keen on altering data, e.g., rounding. I'd rather have the user decide what they want to do with it. I feel that the described use case – from what I gathered — was a somewhat direct use of numerical data without further consideration of the underlying meaning of the numbers.

I'd be strictly against rounding the atomic charges, which are taken directly from the input data, and should be presented as close to their original representation as possible — regardless how ridiculous we might find a charge spelled out to 8 decimals. (I understand that this is not the proposal here, I just wanted to make my stance clear.)

The per-residue total charge is something that MDAnalysis computes so I acknowledge that we do have the freedom to decide how we want to present it. I'd be more comfortable with giving the user the option to round — mainly that allows us to write a doc entry to say that "non-integer charges can happen and if you don't like it, set round=4" or something like that. However, personally I'd keep the default at no rounding.

It would be good to hear others. This also impinges on the "consistent dtype across the universe" plan that @hmacdope in particular has been pursuing.

tylerjereddy · 2025-05-09T17:11:49Z

Got it, thanks. I don't have a strong view, was just passing this concern along from the scientists. Maybe I'll wait for others to chime in before adjusting to offer an option to round (or just documenting that folks may want to round on their end, whatever the team prefers).

tylerjereddy · 2025-05-17T20:58:07Z

Is this converging on a docs-only suggestion that rounding may be needed to get sensible charges for some user applications?

I'm skeptical this will do much to prevent the original footgun I described. What may be more convenient for our API doesn't really do much for improving end user correctness.

That said, if that's the consensus I'll try to add a few sentences in the docs, but seems pretty low value/low visibility.

tylerjereddy · 2025-05-17T21:00:27Z

Also, I think Irfan and I are in favor of rounding to some degree (or offering the API option to), while Oli and Lily are not, so this is currently somewhat tied, no?

tylerjereddy · 2025-05-17T21:01:24Z

I think the fact that the devs are disagreeing a bit here also somewhat reinforces the potential for a less experienced user to experience a footgun.

orbeckst · 2025-05-18T22:30:43Z

I think it should be a user choice to do any rounding. The whole work on the guessers has been driven by reducing well-meaning but not always correct choices. The idea of dtypes throughout wants to make the precision of operations transparent — admittedly, I’m not sure when this will get done, but we have it on the road map. Building in rounding by default does not fit in well with the direction in which we wanted to go.

How would you want to give the user the option? Could this be done with a guesser (eg something that has the assumption built in that residues have integer charges) so that these assumptions are made clear and require a conscientious choice? That would also help with reproducibility, I think.

IAlibay · 2025-05-18T22:43:29Z

In this particular case, where the issue is down to the idea that floats should sum to an int, how about we implement a residue / segment / etc level "formal_charge" attribute that either sums the per atom formal charges (if available) or approximates them from the partial charges if the sum is within a tolerance of an int? (edit: if it doesn't turn return a NaN)

orbeckst · 2025-05-18T23:22:34Z

That may be a viable way forward!Am 5/18/25 um 18:43 schrieb Irfan Alibay ***@***.***>: IAlibay left a comment (MDAnalysis/mdanalysis#5035) In this particular case, where the issue is down to the idea that floats should sum to an int, how about we implement a residue / segment / etc level "formal_charge" attribute that either sums the per atom formal charges (if available) or approximates them from the partial charges if the sum is within a tolerance of an int? —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you commented.Message ID: ***@***.***>

tylerjereddy · 2025-05-19T02:49:10Z

User choice and/or separate attribute that auto-rounds seems ok.

tylerjereddy added maintainability Format-Amber labels Apr 18, 2025

tylerjereddy commented Apr 18, 2025

View reviewed changes

orbeckst mentioned this pull request May 9, 2025

Query: amino acid residue charges #5032

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

MAINT: rounding residue partial charges #5035

MAINT: rounding residue partial charges #5035

Uh oh!

tylerjereddy commented Apr 18, 2025 •

edited by github-actions bot

Loading

Uh oh!

tylerjereddy Apr 18, 2025

Uh oh!

IAlibay May 13, 2025

Uh oh!

lilyminium May 14, 2025

Uh oh!

orbeckst May 14, 2025

Uh oh!

lilyminium May 17, 2025

Uh oh!

codecov bot commented Apr 18, 2025 •

edited

Loading

Uh oh!

orbeckst commented May 9, 2025 •

edited

Loading

Uh oh!

tylerjereddy commented May 9, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

orbeckst commented May 18, 2025

Uh oh!

IAlibay commented May 18, 2025 •

edited

Loading

Uh oh!

orbeckst commented May 18, 2025 via email

Uh oh!

tylerjereddy commented May 19, 2025

Uh oh!

Uh oh!

MAINT: rounding residue partial charges #5035

Are you sure you want to change the base?

MAINT: rounding residue partial charges #5035

Uh oh!

Conversation

tylerjereddy commented Apr 18, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Checklist

Developers Certificate of Origin

Uh oh!

tylerjereddy Apr 18, 2025

Choose a reason for hiding this comment

Uh oh!

IAlibay May 13, 2025

Choose a reason for hiding this comment

Uh oh!

lilyminium May 14, 2025

Choose a reason for hiding this comment

Uh oh!

orbeckst May 14, 2025

Choose a reason for hiding this comment

Uh oh!

lilyminium May 17, 2025

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Apr 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

orbeckst commented May 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tylerjereddy commented May 9, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

tylerjereddy commented May 17, 2025

Uh oh!

orbeckst commented May 18, 2025

Uh oh!

IAlibay commented May 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

orbeckst commented May 18, 2025 via email

Uh oh!

tylerjereddy commented May 19, 2025

Uh oh!

Uh oh!

tylerjereddy commented Apr 18, 2025 •

edited by github-actions bot

Loading

codecov bot commented Apr 18, 2025 •

edited

Loading

orbeckst commented May 9, 2025 •

edited

Loading

IAlibay commented May 18, 2025 •

edited

Loading