Skip to content

Sync with steven#27

Open
ftclark3 wants to merge 15 commits into
cabb99:masterfrom
ftclark3:sync_with_steven
Open

Sync with steven#27
ftclark3 wants to merge 15 commits into
cabb99:masterfrom
ftclark3:sync_with_steven

Conversation

@ftclark3

@ftclark3 ftclark3 commented May 28, 2026

Copy link
Copy Markdown

Talked with @stevenluo22 about how we can merge his edits into this branch and this was the result

Summary by Sourcery

Add configurable electrostatics and biasing forces for protein–DNA and DNA-only systems, and introduce scripts to set up, run, and analyze protein–DNA OpenMM simulations.

New Features:

  • Allow overriding radii and cutoffs in protein–DNA exclusion and electrostatics forces, including configurable Debye length and cutoff distances.
  • Introduce biasing and measurement forces for protein–DNA systems, including electrostatic bias, base-pair–protein harmonic biases, and string potentials that can optionally report distances instead of energies.
  • Add DNA-only positional restraint and distance-measurement collective-variable forces based on group centers of mass.
  • Provide command-line scripts to set up forces, run MD, and compute energies and analysis for protein–DNA simulations, along with a helper to compute DNA twist angles.

Enhancements:

  • Refine protein–DNA string potential to support configurable force groups and an option to compute only intergroup distance, consolidating a separate length-only class into this behavior.
  • Extend DNA electrostatics to support user-specified Debye length and cutoff distance with diagnostic logging of screening parameters.

@sourcery-ai

sourcery-ai Bot commented May 28, 2026

Copy link
Copy Markdown

Reviewer's Guide

Adds tunable electrostatics/exclusion parameters, several new protein–DNA bias and measurement force classes, and a set of driver/analysis scripts around protein–DNA simulations, while slightly generalizing the existing string potential API.

Flow diagram for new protein–DNA simulation and analysis scripts

flowchart LR
    run["protein_DNA_run.py"]
    energy["protein_DNA_energy.py"]
    analysis["protein_DNA_analysis.py"]
    forces["forces_setup.py (set_up_forces)"]
    system["OpenMM System + Simulation"]
    traj["output.dcd trajectory"]

    run --> forces
    energy --> forces

    forces --> system

    run -->|runs MD, writes| traj
    energy -->|minimization/short MD, logs energies| system

    analysis --> forces
    analysis -->|reads| traj
    analysis -->|recomputes energies via Simulation| system
Loading

File-Level Changes

Change Details Files
Make protein–DNA exclusion and electrostatic forces configurable via constructor parameters instead of hard‑coded constants.
  • Add optional radius_override and cutoff parameters to ExclusionProteinDNA and propagate them into the OpenMM CustomNonbondedForce setup.
  • Add optional ldby (Debye length) and cutoff_distance parameters to ElectrostaticsProteinDNA and use them when constructing the electrostatic CustomNonbondedForce, maintaining backward‑compatible defaults.
  • Inject basic diagnostic printing of cutoff/screening parameters during force reset.
open3SPN2/force/protein_dna.py
Introduce new protein–DNA bias/measurement force classes for electrostatic biasing and protein–base‑pair geometry constraints, and extend the existing string force to optionally report distance instead of energy.
  • Add BiasElectrostaticsProteinDNA to wrap ElectrostaticsProteinDNA in a CustomCVForce harmonic bias on electrostatic energy.
  • Add proteinBasePairGroupsHarmonicBias and proteinBasePairGroupsPosition, which build CustomCentroidBondForce expressions tying protein position to base‑pair groups along the DNA.
  • Extend StringProteinDNA to accept a force_group and a compute_distance_not_energy flag that switches the CustomCentroidBondForce between harmonic and pure distance, and remove the now‑redundant String_length_ProteinDNA class.
  • Tighten StringProteinDNA group selection logic to a single protein and DNA chain each, with optional segmenting via group indices.
open3SPN2/force/protein_dna.py
Make DNA–DNA electrostatics configurable and add position‑based collective variable forces for DNA center‑of‑mass restraint/measurement.
  • Extend Electrostatics in dna.py to accept optional ldby and cutoff_distance parameters, overriding the derived Debye length and default cutoff when provided.
  • Add group_constraint_by_position, a DNAForce that creates a harmonic CustomCVForce on the (mass‑weighted) center of mass of selected DNA atoms toward a target position.
  • Add measure_from_position, a DNAForce that defines a CustomCVForce reporting the squared distance of a DNA COM from a reference point for measurement without bias.
  • Print basic diagnostics for DNA electrostatics screening length and cutoff on reset.
open3SPN2/force/dna.py
Add driver scripts to set up, run, and analyze protein–DNA simulations using open3SPN2 and openAWSEM forces.
  • Introduce protein_DNA_energy.py to construct a protein–DNA system from a PDB, load a configurable forces_setup.py, run energy minimizations, and write energy breakdowns and snapshots.
  • Introduce protein_DNA_run.py to support MD runs (constant‑T or annealing) with checkpointing, reporters, and post‑run analysis invocation.
  • Introduce protein_DNA_analysis.py to replay a trajectory, reconstruct forces via forces_setup.py, and output per‑frame energies for named force groups.
  • Introduce forces_setup.py as a shared force construction module wiring standard open3SPN2 DNA forces, protein–DNA forces, and openAWSEM protein forces (including fragment and Debye–Hückel terms) into a System, with optional Q/Rg analysis forces.
open3SPN2/scripts/protein_DNA_energy.py
open3SPN2/scripts/protein_DNA_run.py
open3SPN2/scripts/protein_DNA_analysis.py
open3SPN2/scripts/forces_setup.py
Add a small utility module for computing DNA twist using several vector‑based geometric definitions.
  • Implement vector helper routines (add, scale, norms, dot/cross products) and three twist calculation functions (Xun_twist_OG, Xun_twist, Steven_twist), plus helpers to compare them.
  • Provide a script entry point placeholder for manual testing of twist functions.
open3SPN2/scripts/compute_twist.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • Several core force classes and utilities now print debugging information directly (e.g., cutoff distances, k_ebias, centers) on reset/defineInteraction; consider removing or gating these behind a verbosity flag or logger to avoid noisy output in production runs.
  • In multiple places you compare to None with '== None' (e.g., radius_override, cutoff_distance) and sometimes assume unit-bearing vs plain numeric values (e.g., cutoff_distance handling); using 'is None' and consistently enforcing/validating units at the API boundary would make the new parameters safer and less error-prone.
  • forces_setup.py currently hardcodes AWSEM_folder and fragment paths and imports every term unconditionally; making these configurable (e.g., via function arguments or environment variables) and avoiding project-specific defaults would improve reuse of this helper in different environments.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Several core force classes and utilities now print debugging information directly (e.g., cutoff distances, k_ebias, centers) on reset/defineInteraction; consider removing or gating these behind a verbosity flag or logger to avoid noisy output in production runs.
- In multiple places you compare to None with '== None' (e.g., radius_override, cutoff_distance) and sometimes assume unit-bearing vs plain numeric values (e.g., cutoff_distance handling); using 'is None' and consistently enforcing/validating units at the API boundary would make the new parameters safer and less error-prone.
- forces_setup.py currently hardcodes AWSEM_folder and fragment paths and imports every term unconditionally; making these configurable (e.g., via function arguments or environment variables) and avoiding project-specific defaults would improve reuse of this helper in different environments.

## Individual Comments

### Comment 1
<location path="open3SPN2/force/protein_dna.py" line_range="129-134" />
<code_context>
+        
         ldby = ldby.in_units_of(unit.nanometer)
+
+        if self.cutoff_distance == None:
+            cutoff_distance = 5 # for backward compatibility
+        else:
+            cutoff_distance = self.cutoff_distance
+
+        cutoff_nm = cutoff_distance.value_in_unit(unit.nanometer)
+
         denominator = 4 * np.pi * pv * dielectric / (Na * ec ** 2)
</code_context>
<issue_to_address>
**issue (bug_risk):** cutoff_distance default leads to unitless value and will break value_in_unit()

Here `cutoff_distance` becomes a bare float when `self.cutoff_distance is None`, so `cutoff_distance.value_in_unit(...)` will raise `AttributeError`. To preserve units and backward compatibility, make the default a `Quantity` (e.g. `self.cutoff_distance = 5*unit.nanometer` in `__init__`) and keep it as a `Quantity` in the fallback, or set `cutoff_distance = 5*unit.nanometer` here instead of `5`. Also check any other paths that might pass a plain float into this code.
</issue_to_address>

### Comment 2
<location path="open3SPN2/force/dna.py" line_range="518-523" />
<code_context>
+        
         ldby = ldby.in_units_of(unit.nanometer)
+
+        if self.cutoff_distance == None:
+            cutoff_distance = 5 # for backward compatibility
+        else:
+            cutoff_distance = self.cutoff_distance
+
+        cutoff_nm = cutoff_distance.value_in_unit(unit.nanometer)
+
         denominator = 4 * np.pi * pv * dielectric / (Na * ec ** 2)
</code_context>
<issue_to_address>
**issue (bug_risk):** Electrostatics cutoff_distance has same unitless-default issue as protein-DNA electrostatics

Because the fallback uses a bare float (`5`), calling `cutoff_distance.value_in_unit(unit.nanometer)` will fail unless `cutoff_distance` is always an OpenMM Quantity. To align with the protein–DNA electrostatics fix and keep backward compatibility, either:

- set the `__init__` default to `cutoff_distance = 5*unit.nanometer`, or
- in `reset`, use `cutoff_distance = 5*unit.nanometer` in the backward-compatible branch.

This preserves the existing API while preventing runtime errors.
</issue_to_address>

### Comment 3
<location path="open3SPN2/force/dna.py" line_range="568-577" />
<code_context>
+class group_constraint_by_position(DNAForce):
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Constraint uses sum of mass-weighted coordinates, not center of mass as advertised

The current CV uses `sum(m_i x_i)` directly, so the effective reference position scales with the number and masses of atoms instead of being invariant like a true COM (which would be `sum(m_i x_i) / sum(m_i)`). To match the documented behavior, either:
- divide each coordinate sum by `total_mass` in the `CustomCVForce` (e.g., `(sum_x/total_mass - x0)^2`), or
- normalize the per-particle weights to `mass/total_mass` and keep the existing expression.

Aligning the implementation with the docstring will prevent downstream confusion and unintended parameter changes.

Suggested implementation:

```python
class group_constraint_by_position(DNAForce):
    """
    Apply a harmonic restraint on the center of mass of selected DNA atoms to keep them near (x0, y0, z0).

    The actual CV is the center-of-mass (COM) position:
        COM = sum_i(m_i * r_i) / sum_i(m_i)
    not just the raw mass-weighted position sum_i(m_i * r_i).
    """

    def __init__(self, dna, k=1*unit.kilocalorie_per_mole, 
                 x0=10*unit.angstrom, y0=10*unit.angstrom, z0=10*unit.angstrom, 
                 appliedToResidues=None, force_group=24, OpenCLPatch=True):

        # Store configuration
        self.dna = dna
        self.force_group = force_group
        self.k = k
        self.x0 = x0
        self.y0 = y0
        self.z0 = z0
        self.appliedToResidues = appliedToResidues
        self.OpenCLPatch = OpenCLPatch

        # Build the OpenMM force object
        self._create_force()

    def _create_force(self):
        """
        Create a CustomCVForce that restrains the center of mass of the selected atoms.
        """
        import openmm as mm
        from openmm import unit as omm_unit

        system = self.dna.system
        topology = self.dna.topology

        # Collect the particle indices that belong to the selected residues
        selected_particle_indices = []
        if self.appliedToResidues is None:
            # Default: all DNA atoms; assumes dna.atom_list or equivalent is available.
            # Adapt this selection logic to your existing DNAForce conventions.
            for atom in topology.atoms():
                if atom.residue.name.startswith("D") or atom.residue.name.startswith("R"):
                    selected_particle_indices.append(atom.index)
        else:
            residues_set = set(self.appliedToResidues)
            for atom in topology.atoms():
                if atom.residue.index in residues_set:
                    selected_particle_indices.append(atom.index)

        if len(selected_particle_indices) == 0:
            raise ValueError("group_constraint_by_position: no atoms selected for COM restraint.")

        # Compute total mass of the group
        masses = [system.getParticleMass(i) for i in selected_particle_indices]
        total_mass = sum(masses)  # this is an OpenMM unit-bearing quantity

        # Create per-coordinate CVs: sum(m_i * x_i), sum(m_i * y_i), sum(m_i * z_i)
        # These are still mass-weighted sums; we convert to COM by dividing by total_mass in the main CV expression.
        sum_x_force = mm.CustomCVForce("sum(mass * x)")
        sum_y_force = mm.CustomCVForce("sum(mass * y)")
        sum_z_force = mm.CustomCVForce("sum(mass * z)")

        sum_x_force.addPerParticleParameter("mass")
        sum_y_force.addPerParticleParameter("mass")
        sum_z_force.addPerParticleParameter("mass")

        for idx, m in zip(selected_particle_indices, masses):
            sum_x_force.addParticle(idx, [m])
            sum_y_force.addParticle(idx, [m])
            sum_z_force.addParticle(idx, [m])

        # Main COM restraint CV:
        # Use COM = sum(m_i * coord_i) / total_mass for each coordinate
        com_restraint = mm.CustomCVForce(
            "0.5 * k * ( (sum_x/total_mass - x0)^2 + (sum_y/total_mass - y0)^2 + (sum_z/total_mass - z0)^2 )"
        )

        # Add the collective variables that provide sum_x, sum_y, sum_z
        com_restraint.addCollectiveVariable("sum_x", sum_x_force)
        com_restraint.addCollectiveVariable("sum_y", sum_y_force)
        com_restraint.addCollectiveVariable("sum_z", sum_z_force)

        # Global parameters: force constant, reference COM position, and total mass of the group
        # Note: total_mass is constant for this group; exposing it as a global parameter
        # makes the COM definition explicit and invariant to group size.
        com_restraint.addGlobalParameter("k", self.k)
        com_restraint.addGlobalParameter("x0", self.x0)
        com_restraint.addGlobalParameter("y0", self.y0)
        com_restraint.addGlobalParameter("z0", self.z0)
        com_restraint.addGlobalParameter("total_mass", total_mass)

        com_restraint.setForceGroup(self.force_group)

        # Register this force with the DNAForce infrastructure
        self.force = com_restraint
        self.addForce(self.force, self.OpenCLPatch)

```

1. The selection logic for `selected_particle_indices` must be aligned with how other `DNAForce` subclasses in this file choose their atoms (e.g., there might already be helper methods or precomputed atom lists you should reuse instead of the placeholder `topology`-based loop).
2. The current implementation of `group_constraint_by_position` in the PR likely already constructs `CustomCVForce` instances for `sum(m_i * x_i)` etc.; if so, you can instead **only** change:
   - the main CV expression from `(sum_x - x0)^2` to `(sum_x/total_mass - x0)^2` (and similarly for y, z), and  
   - add a global parameter `total_mass` initialized to the Python-computed total mass for the group.
3. Ensure that any existing calls to `addForce` or class initialization in `DNAForce` are preserved; if `DNAForce` expects a different initialization pattern (e.g., overriding `setup()` rather than calling `_create_force()` in `__init__`), adapt `_create_force()` to that pattern instead of calling `addForce` directly as shown.
</issue_to_address>

### Comment 4
<location path="open3SPN2/force/dna.py" line_range="614-623" />
<code_context>
+
+        self.force = harmonic
+
+    def defineInteraction(self):
+        """
+        Adds selected DNA atoms to the sum_x, sum_y, sum_z forces with appropriate mass weighting.
+        """
+        total_mass = 0.0
+        for i in range(len(self.dna.atoms)):
+            #mass = self.dna.system.getParticleMass(i).value_in_unit(unit.dalton)
+            mass = 1
+
+            # Add particle to sum_x, sum_y, sum_z
+            self.force.getCollectiveVariable(0).addParticle(i, [mass])
+            self.force.getCollectiveVariable(1).addParticle(i, [mass])
+            self.force.getCollectiveVariable(2).addParticle(i, [mass])
+            
+            total_mass += mass
+
+        if total_mass == 0:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** appliedToResidues and actual masses are ignored in group_constraint_by_position

In `defineInteraction`, all DNA atoms are added and `mass = 1` is used for every particle (with the `getParticleMass` call commented out). If residue-based selection is meant to matter, the indices should be filtered using `appliedToResidues` before adding them to the CVs. Also, using unit mass for all atoms turns the CV into an unweighted average position rather than a true COM. Please either restore `getParticleMass` (or equivalent) or clearly document that this CV is intentionally unweighted, and ensure `appliedToResidues` is respected when building the index list.

Suggested implementation:

```python
        self.force = harmonic

    def defineInteraction(self):
        """
        Adds selected DNA atoms to the sum_x, sum_y, sum_z collective variables with mass weighting
        to compute a true center-of-mass (COM). If ``appliedToResidues`` is provided, only atoms
        belonging to those residues are included.
        """
        total_mass = 0.0

        # Build the list of atom indices to include, honoring appliedToResidues if provided.
        if self.appliedToResidues is not None:
            atom_indices = []
            for i, atom in enumerate(self.dna.atoms):
                # Assumes atoms expose a residue identifier as `resid`.
                # If your topology uses a different attribute, update this accordingly.
                resid = getattr(atom, "resid", None)
                if resid in self.appliedToResidues:
                    atom_indices.append(i)
        else:
            atom_indices = range(len(self.dna.atoms))

        for i in atom_indices:
            # Use the actual particle mass so the CV represents a true COM, not an unweighted average.
            mass = self.dna.system.getParticleMass(i).value_in_unit(unit.dalton)

            # Add particle to sum_x, sum_y, sum_z with its mass as the weight
            self.force.getCollectiveVariable(0).addParticle(i, [mass])
            self.force.getCollectiveVariable(1).addParticle(i, [mass])
            self.force.getCollectiveVariable(2).addParticle(i, [mass])

            total_mass += mass

        if total_mass == 0:
            raise ValueError("No atoms were selected for the group constraint; check appliedToResidues.")

```

1. Ensure that the atom objects in `self.dna.atoms` expose the residue identifier as `atom.resid`. If a different attribute is used (e.g. `atom.residue.id` or `atom.residue.index`), adjust the `resid = getattr(atom, "resid", None)` line and the subsequent membership test accordingly.
2. Confirm that `unit` is imported in this module (e.g. `from openmm import unit` or `import openmm.unit as unit`) so that `value_in_unit(unit.dalton)` works as expected.
3. If the codebase already has a helper for residue-based atom selection (e.g. on `DNAForce` or `dna`), you may prefer to replace the explicit loop that builds `atom_indices` with that helper to keep residue-selection logic consistent across forces.
</issue_to_address>

### Comment 5
<location path="open3SPN2/force/protein_dna.py" line_range="39-42" />
<code_context>
             exclusionForce.setNonbondedMethod(exclusionForce.CutoffPeriodic)
         else:
             exclusionForce.setNonbondedMethod(exclusionForce.CutoffNonPeriodic)
+        print(f"protein dna cutoff {exclusionForce.getCutoffDistance()}")
         self.force = exclusionForce

</code_context>
<issue_to_address>
**suggestion:** Unconditional debug print statements in force construction may clutter output

These new print calls (e.g., for the protein-DNA cutoff) will fire on every force construction and can create a lot of noisy stdout in larger or batch runs. Please route them through a logger or guard them with a verbosity/debug flag so they can be disabled in production use.

Suggested implementation:

```python
        if self.periodic:
            exclusionForce.setNonbondedMethod(exclusionForce.CutoffPeriodic)
        else:
            exclusionForce.setNonbondedMethod(exclusionForce.CutoffNonPeriodic)
        logger.debug("Protein-DNA cutoff distance: %s", exclusionForce.getCutoffDistance())
        self.force = exclusionForce

```

To make this compile and behave as intended, you should also:

1. Import the `logging` module near the top of `open3SPN2/force/protein_dna.py` and define a module-level logger, for example:
   - `import logging`
   - `logger = logging.getLogger(__name__)`

2. Ensure your application or scripts configure the logging system (e.g., with `logging.basicConfig(...)` or a logging config file) so that debug logs can be enabled/disabled as needed.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread open3SPN2/force/protein_dna.py
Comment thread open3SPN2/force/dna.py
Comment thread open3SPN2/force/dna.py
Comment on lines +568 to +577
class group_constraint_by_position(DNAForce):
"""
Apply a harmonic restraint on the center of mass of selected DNA atoms to keep them near (x0, y0, z0).
"""

def __init__(self, dna, k=1*unit.kilocalorie_per_mole,
x0=10*unit.angstrom, y0=10*unit.angstrom, z0=10*unit.angstrom,
appliedToResidues=None, force_group=24, OpenCLPatch=True):

self.force_group = force_group

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): Constraint uses sum of mass-weighted coordinates, not center of mass as advertised

The current CV uses sum(m_i x_i) directly, so the effective reference position scales with the number and masses of atoms instead of being invariant like a true COM (which would be sum(m_i x_i) / sum(m_i)). To match the documented behavior, either:

  • divide each coordinate sum by total_mass in the CustomCVForce (e.g., (sum_x/total_mass - x0)^2), or
  • normalize the per-particle weights to mass/total_mass and keep the existing expression.

Aligning the implementation with the docstring will prevent downstream confusion and unintended parameter changes.

Suggested implementation:

class group_constraint_by_position(DNAForce):
    """
    Apply a harmonic restraint on the center of mass of selected DNA atoms to keep them near (x0, y0, z0).

    The actual CV is the center-of-mass (COM) position:
        COM = sum_i(m_i * r_i) / sum_i(m_i)
    not just the raw mass-weighted position sum_i(m_i * r_i).
    """

    def __init__(self, dna, k=1*unit.kilocalorie_per_mole, 
                 x0=10*unit.angstrom, y0=10*unit.angstrom, z0=10*unit.angstrom, 
                 appliedToResidues=None, force_group=24, OpenCLPatch=True):

        # Store configuration
        self.dna = dna
        self.force_group = force_group
        self.k = k
        self.x0 = x0
        self.y0 = y0
        self.z0 = z0
        self.appliedToResidues = appliedToResidues
        self.OpenCLPatch = OpenCLPatch

        # Build the OpenMM force object
        self._create_force()

    def _create_force(self):
        """
        Create a CustomCVForce that restrains the center of mass of the selected atoms.
        """
        import openmm as mm
        from openmm import unit as omm_unit

        system = self.dna.system
        topology = self.dna.topology

        # Collect the particle indices that belong to the selected residues
        selected_particle_indices = []
        if self.appliedToResidues is None:
            # Default: all DNA atoms; assumes dna.atom_list or equivalent is available.
            # Adapt this selection logic to your existing DNAForce conventions.
            for atom in topology.atoms():
                if atom.residue.name.startswith("D") or atom.residue.name.startswith("R"):
                    selected_particle_indices.append(atom.index)
        else:
            residues_set = set(self.appliedToResidues)
            for atom in topology.atoms():
                if atom.residue.index in residues_set:
                    selected_particle_indices.append(atom.index)

        if len(selected_particle_indices) == 0:
            raise ValueError("group_constraint_by_position: no atoms selected for COM restraint.")

        # Compute total mass of the group
        masses = [system.getParticleMass(i) for i in selected_particle_indices]
        total_mass = sum(masses)  # this is an OpenMM unit-bearing quantity

        # Create per-coordinate CVs: sum(m_i * x_i), sum(m_i * y_i), sum(m_i * z_i)
        # These are still mass-weighted sums; we convert to COM by dividing by total_mass in the main CV expression.
        sum_x_force = mm.CustomCVForce("sum(mass * x)")
        sum_y_force = mm.CustomCVForce("sum(mass * y)")
        sum_z_force = mm.CustomCVForce("sum(mass * z)")

        sum_x_force.addPerParticleParameter("mass")
        sum_y_force.addPerParticleParameter("mass")
        sum_z_force.addPerParticleParameter("mass")

        for idx, m in zip(selected_particle_indices, masses):
            sum_x_force.addParticle(idx, [m])
            sum_y_force.addParticle(idx, [m])
            sum_z_force.addParticle(idx, [m])

        # Main COM restraint CV:
        # Use COM = sum(m_i * coord_i) / total_mass for each coordinate
        com_restraint = mm.CustomCVForce(
            "0.5 * k * ( (sum_x/total_mass - x0)^2 + (sum_y/total_mass - y0)^2 + (sum_z/total_mass - z0)^2 )"
        )

        # Add the collective variables that provide sum_x, sum_y, sum_z
        com_restraint.addCollectiveVariable("sum_x", sum_x_force)
        com_restraint.addCollectiveVariable("sum_y", sum_y_force)
        com_restraint.addCollectiveVariable("sum_z", sum_z_force)

        # Global parameters: force constant, reference COM position, and total mass of the group
        # Note: total_mass is constant for this group; exposing it as a global parameter
        # makes the COM definition explicit and invariant to group size.
        com_restraint.addGlobalParameter("k", self.k)
        com_restraint.addGlobalParameter("x0", self.x0)
        com_restraint.addGlobalParameter("y0", self.y0)
        com_restraint.addGlobalParameter("z0", self.z0)
        com_restraint.addGlobalParameter("total_mass", total_mass)

        com_restraint.setForceGroup(self.force_group)

        # Register this force with the DNAForce infrastructure
        self.force = com_restraint
        self.addForce(self.force, self.OpenCLPatch)
  1. The selection logic for selected_particle_indices must be aligned with how other DNAForce subclasses in this file choose their atoms (e.g., there might already be helper methods or precomputed atom lists you should reuse instead of the placeholder topology-based loop).
  2. The current implementation of group_constraint_by_position in the PR likely already constructs CustomCVForce instances for sum(m_i * x_i) etc.; if so, you can instead only change:
    • the main CV expression from (sum_x - x0)^2 to (sum_x/total_mass - x0)^2 (and similarly for y, z), and
    • add a global parameter total_mass initialized to the Python-computed total mass for the group.
  3. Ensure that any existing calls to addForce or class initialization in DNAForce are preserved; if DNAForce expects a different initialization pattern (e.g., overriding setup() rather than calling _create_force() in __init__), adapt _create_force() to that pattern instead of calling addForce directly as shown.

Comment thread open3SPN2/force/dna.py
Comment on lines +614 to +623
def defineInteraction(self):
"""
Adds selected DNA atoms to the sum_x, sum_y, sum_z forces with appropriate mass weighting.
"""
total_mass = 0.0
for i in range(len(self.dna.atoms)):
#mass = self.dna.system.getParticleMass(i).value_in_unit(unit.dalton)
mass = 1

# Add particle to sum_x, sum_y, sum_z

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): appliedToResidues and actual masses are ignored in group_constraint_by_position

In defineInteraction, all DNA atoms are added and mass = 1 is used for every particle (with the getParticleMass call commented out). If residue-based selection is meant to matter, the indices should be filtered using appliedToResidues before adding them to the CVs. Also, using unit mass for all atoms turns the CV into an unweighted average position rather than a true COM. Please either restore getParticleMass (or equivalent) or clearly document that this CV is intentionally unweighted, and ensure appliedToResidues is respected when building the index list.

Suggested implementation:

        self.force = harmonic

    def defineInteraction(self):
        """
        Adds selected DNA atoms to the sum_x, sum_y, sum_z collective variables with mass weighting
        to compute a true center-of-mass (COM). If ``appliedToResidues`` is provided, only atoms
        belonging to those residues are included.
        """
        total_mass = 0.0

        # Build the list of atom indices to include, honoring appliedToResidues if provided.
        if self.appliedToResidues is not None:
            atom_indices = []
            for i, atom in enumerate(self.dna.atoms):
                # Assumes atoms expose a residue identifier as `resid`.
                # If your topology uses a different attribute, update this accordingly.
                resid = getattr(atom, "resid", None)
                if resid in self.appliedToResidues:
                    atom_indices.append(i)
        else:
            atom_indices = range(len(self.dna.atoms))

        for i in atom_indices:
            # Use the actual particle mass so the CV represents a true COM, not an unweighted average.
            mass = self.dna.system.getParticleMass(i).value_in_unit(unit.dalton)

            # Add particle to sum_x, sum_y, sum_z with its mass as the weight
            self.force.getCollectiveVariable(0).addParticle(i, [mass])
            self.force.getCollectiveVariable(1).addParticle(i, [mass])
            self.force.getCollectiveVariable(2).addParticle(i, [mass])

            total_mass += mass

        if total_mass == 0:
            raise ValueError("No atoms were selected for the group constraint; check appliedToResidues.")
  1. Ensure that the atom objects in self.dna.atoms expose the residue identifier as atom.resid. If a different attribute is used (e.g. atom.residue.id or atom.residue.index), adjust the resid = getattr(atom, "resid", None) line and the subsequent membership test accordingly.
  2. Confirm that unit is imported in this module (e.g. from openmm import unit or import openmm.unit as unit) so that value_in_unit(unit.dalton) works as expected.
  3. If the codebase already has a helper for residue-based atom selection (e.g. on DNAForce or dna), you may prefer to replace the explicit loop that builds atom_indices with that helper to keep residue-selection logic consistent across forces.

Comment thread open3SPN2/force/protein_dna.py
cabb99 added a commit that referenced this pull request Jun 4, 2026
Brings in Steven Luo's production-run and analysis scripts. The overlapping
force-code changes are taken from PR #27 (the consolidated rework of the same
work); PR #26's multi-chain string classes are superseded there and will be
folded into a single generalized class during the follow-up refactor.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant