Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 1998 #2018

Merged
merged 9 commits into from
Feb 13, 2025
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@ __pycache__/
# OS X extra files
*.DS_Store

# VSCode
*.vscode

# Distribution / packaging
.Python
env/
Expand Down
1 change: 1 addition & 0 deletions docs/releasehistory.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ Releases follow the `major.minor.micro` scheme recommended by [PEP440](https://w

- [PR #1981](https://github.com/openforcefield/openff-toolkit/pull/1981): Updates documentation to run quicker and use new features, including some in Interchange 0.4.
- [PR #2007](https://github.com/openforcefield/openff-toolkit/pull/2007): Documents that the dicts expected by various `from_dict` methods are expected to match the structure of the results of corresponding `to_dict` methods.
- [PR #2018](https://github.com/openforcefield/openff-toolkit/pull/2018): Interlinking of docstrings for further clarity in use of `from_dict` and `to_dict` for `Atom`, `Bond`, and `Molecule` classes.

### Miscellaneous

Expand Down
71 changes: 49 additions & 22 deletions openff/toolkit/topology/molecule.py
Original file line number Diff line number Diff line change
Expand Up @@ -301,7 +301,11 @@ def add_bond(self, bond: "Bond"):
self._bonds.append(bond)

def to_dict(self) -> dict[str, Union[None, str, int, bool, dict[Any, Any]]]:
"""Return a dict representation of the atom."""
"""Return a dict representation of the :class:`Atom` class instance.

Output dictionary keys and values align with parameters used to initialize
the :class:`Atom` class.
"""
# TODO: Should this be implicit in the atom ordering when saved?
# atom_dict['molecule_atom_index'] = self._molecule_atom_index
return {
Expand All @@ -315,10 +319,10 @@ def to_dict(self) -> dict[str, Union[None, str, int, bool, dict[Any, Any]]]:

@classmethod
def from_dict(cls: type[A], atom_dict: dict) -> A:
"""
Create an Atom from a dict representation.
"""Create an :class:`Atom` class instance from a dict representation.

The structure of the dict expected by this function is defined by the output of `Atom.to_dict()`.
The structure of the dict expected by this function is defined by the output of
:meth:`Atom.to_dict()`.
"""
return cls(**atom_dict)

Expand Down Expand Up @@ -696,8 +700,10 @@ def __init__(
self._stereochemistry = stereochemistry

def to_dict(self) -> dict[str, Union[int, bool, str, float]]:
"""
Return a dict representation of the bond.
"""Return a ``dict`` representation of the bond.

The output dictionary keys and values align with parameters used to initialize
the :class:`Bond` class.

"""
return {
Expand All @@ -714,7 +720,8 @@ def from_dict(cls: type[B], molecule: FM, d: dict) -> B: # type: ignore[overrid
"""
Create a Bond from a dict representation.

The structure of the dict expected by this function is defined by the output of `Bond.to_dict()`.
The structure of the dict expected by this function is defined by the output of
:meth:`Bond.to_dict()`.
"""
# TODO: This is not used anywhere (`Molecule._initialize_bonds_from_dict()` just calls grabs
# the two atoms and calls `Molecule._add_bond`). Remove or change that?
Expand Down Expand Up @@ -1150,7 +1157,7 @@ def strip_atom_stereochemistry(
####################################################################################################

def to_dict(self) -> dict:
"""
r"""
Return a dictionary representation of the molecule.

.. todo ::
Expand All @@ -1163,6 +1170,27 @@ def to_dict(self) -> dict:
molecule_dict
A dictionary representation of the molecule.

- **name** (str): An optional name to be associated with the molecule
- **atoms** (list[dict]): A list of dictionary inputs for :meth:`Atom.from_dict()`
- **bonds** (list[dict]): A list of dictionary inputs for :meth:`Bond.from_dict()`
- **conformers** (list[list]): A list containing the cartesian coordinates of each atoms in the order
mattwthompson marked this conversation as resolved.
Show resolved Hide resolved
defined in ``atoms``.
- **properties** (dict): Outputs from chosen a chosen toolkit:

- **atom_map** (dict): Dictionary of atom index (as in ``atoms`` entry) and the mapped index relevant
to a mapped canonical smiles string
- **\*\*kwargs**: Other toolkit dependent outputs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dict is empty by default, i.e. not including the atom map, so I'd prefer either not listing it here or being more general about what can go in. IIRC it's a dumping ground for just about anything that can be represented in memory

In [10]: Molecule.from_smiles("CCO").to_dict()['properties']
Out[10]: {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I made a comment on the PR wondering if I should be more specific since I can see the functions used by the toolkit for the other kwargs. I'll take that as a no :)

I can see that if this is a dumping ground for metadata, it shouldn't be specified, but in the case of atom_map, I might be uploading a molecule with a mapped CMILES and a geometry I got elsewhere... in that case I might need to figure out how to do this manually, which is why I thought it would be useful to explain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the doctoring of to_smiles it is encouraged for a user to supply an atom_map for this purpose, but doesn't define its format. I think it's important to define it, and I believe this is the place to do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah totally my mistake, I didn't read the second half of that comment

IMHO

  • I have a preference for whatever gets this over the finish line with the lowest friction (I think the original objective of structure of Molecule-generated dictionaries is more than met)
  • The added docs/examples you're considering adding are valuable but may be better for different portions of the docs than the docstrings of some serialization methods. I know CMILES is covered in the molecule cookbook but I forget the details

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that it is "verbally" explained in from_smiles I can remove this if you want, or you can save me the commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either is fine with me, tiebreaker going to "do nothing because that's quicker"

(I tagged the unit string as the only blocker, everything else is up to you)

- **hierarchy_schemes** (dict[dict]): Dictionary where keys (such as ``"residues"`` and ``"chains"``)
represent dictionary outputs from :meth:`HierarchyScheme.to_dict()`
- **conformers_unit** (float, default="angstrom"): Valid unit of length input for the
`OpenFF Units module <https://docs.openforcefield.org/projects/units/en/stable/api/generated/openff.units.html>`_.
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
- **partial_charges** (list[float], default=None): Array of partial charge (in elementary charges)
for atoms in the same order as the output,``atoms``.
- **partial_charge_unit** (float, default=None): Valid unit of charge input for the
`OpenFF Units module <https://docs.openforcefield.org/projects/units/en/stable/api/generated/openff.units.html>`_.
jaclark5 marked this conversation as resolved.
Show resolved Hide resolved
If ``partial_charges`` is also included, the default is ``"elementary_charge"`` instead.

"""
from openff.toolkit.utils.utils import serialize_numpy

Expand Down Expand Up @@ -1244,17 +1272,16 @@ def from_dict(cls: type[FM], molecule_dict: dict) -> FM:
"""
Create a new Molecule from a dictionary representation

The structure of the dict expected by this function is defined by the output of `Molecule.to_dict()`.

Parameters
----------
molecule_dict
A dictionary representation of the molecule.
A dictionary representation of the molecule defined by the inputs of
:meth:`Molecule.to_dict()`.

Returns
-------
molecule
A Molecule created from the dictionary representation
A :class:`Molecule` class instance created from the dictionary representation

"""
# This implementation is a compromise to let this remain as a classmethod
Expand Down Expand Up @@ -5876,14 +5903,13 @@ def __init__(

Parameters
----------

parent
The ``Molecule`` to which this scheme belongs.
The :class:`Molecule` to which this scheme belongs.
uniqueness_criteria
The names of ``Atom`` metadata entries that define this scheme. An
atom belongs to a ``HierarchyElement`` only if its metadata has the
The names of :class:`Atom` metadata entries that define this scheme. An
atom belongs to a :class:`HierarchyElement` only if its metadata has the
same values for these criteria as the other atoms in the
``HierarchyElement``.
:class:`HierarchyElement`.
iterator_name
The name of the iterator that will be exposed to access the hierarchy
elements generated by this scheme
Expand Down Expand Up @@ -5917,8 +5943,9 @@ def __init__(
self.hierarchy_elements: list[HierarchyElement] = list()

def to_dict(self) -> dict:
"""
Serialize this object to a basic dict of strings, ints, and floats
"""Serialize this object to a basic dict of strings and lists of ints.

Keys and values align with parameters used to initialize the :class:`HierarchyScheme` class.
"""
return_dict: dict[str, Union[str, Sequence[Union[str, int, dict]]]] = dict()
return_dict["uniqueness_criteria"] = self.uniqueness_criteria
Expand Down Expand Up @@ -6064,7 +6091,6 @@ def __init__(

Parameters
----------

scheme
The scheme to which this ``HierarchyElement`` belongs
identifier
Expand All @@ -6083,8 +6109,9 @@ def __init__(
setattr(self, uniqueness_component, id_component)

def to_dict(self) -> dict[str, Union[tuple[Union[str, int]], Sequence[int]]]:
"""
Serialize this object to a basic dict of strings and lists of ints.
"""Serialize this object to a basic dict of strings and lists of ints.

Keys and values align with parameters used to initialize the :class:`HierarchyElement` class.
"""
return {
"identifier": self.identifier,
Expand Down