Skip to content

Document charge assignment hierarchy #68

Description

@mattwthompson

The OpenFF Toolkit/Interchange has/had long-standing implementations of hierarchical charge assignment, in particular that charges are assigned from methods in the following order

  1. Look for molecule matches in the charge_from_molecules kwarg
  2. Look for chemical environment matches in library charges
  3. Look for chemical environment matches in charge increments
  4. Try to run some variant of AM1-BCC (presumably this is were graph charges go)

This is in the spec ... but also not presented in a helpful way. I found (detailed below the fold) hints at this hierarchy in each section, but it'd be really convenient to users to have this hierarchy spelled out. I happen to know this behavior because of my familiarity with the implementation, but I'd be really confused if I was a new user trying to figure out which of the many possible charge assignment methods was actually happening. (Or see another failure case below.)

I think, maybe under this section header, a short mention of the above list would be helpful to users. Most of this information is buried in the other sections but not presented in a way that makes it explicit.

As a somewhat problematic example, imagine a user has a polymer of some sort in which AM1 is not practical, there aren't pre-specific charges for charge_from_molecules, there also isn't a charge increment model, and they made a typo in a large SMIRKS pattern describing the library charges of a residue. A scientist is probably only expecting pre-specific or library charges to be used on macromolecules, so our software hanging (the result of firing up AM1) is certainly useless and arguably surprising. This could be worked around, and the toolkit tries (see below) to warn users if AM1-BCC is called on large molecules, but the result in this scenario is confusion as to why AM1-BCC is even in the picture.

>>> from openff.toolkit import Molecule
>>> Molecule.from_smiles(200 * "C").assign_partial_charges(partial_charge_method="am1bcc")
<stdin>:1: UserWarning: Warning! Partial charge method 'am1bcc' is not designed for use on large (i.e. > 150 atoms) molecules and may crash or take hours to run on this molecule (found 602 atoms). For more, see https://docs.openforcefield.org/projects/toolkit/en/stable/faq.html#parameterizing-my-system-which-contains-a-large-molecule-is-taking-forever-whats-wrong

Library charge section

Library charges are applied first, and atoms for which library charges are applied will be excluded from alternative charging schemes listed below.

Note that this isn't true, since charge_from_molecules takes precedence over library charges.

Charge increments section

Note that atoms for which library charges have already been applied are excluded from charging via <ChargeIncrementModel>.

AM1-BCC section

Note that atoms for which prespecified or library charges have already been applied are excluded from charging via <ToolkitAM1BCC>.

“Prespecified charges” section doesn’t even communicate “if you set charges this way, all other methods are skipped” and I really think it should


Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions