Issue 1998 #2018

Merged
merged 9 commits into from
Feb 13, 2025

Conversation

jaclark5
Contributor

@jaclark5 jaclark5 commented Feb 10, 2025

@jaclark5 jaclark5 requested a review from j-wags February 10, 2025 18:55

codecov bot commented Feb 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.11%. Comparing base (9934858) to head (0f3b6a1).
Report is 2 commits behind head on main.

@jaclark5
Contributor Author

In the docstring for FrozenMolecule.to_dict, I left out any reference to cached_properties since it seems they aren't used.

In my description of the properties output, I described atom_map and left the rest ambiguous. I could make it more specific by specifying that the remainder of the dictionary entries are the output from OEGetSDDataIter and GetPropsAsDict for OpenEye and RDKit respectively.

@jaclark5 jaclark5 removed the request for review from j-wags February 10, 2025 22:47
Member

@mattwthompson mattwthompson left a comment

LGTM, apart from two tiny errors I found (the units annotated as float), which I believe should be fixed before merge.

I found a lot of other details that are either out of scope here and don't need fixing, or can go either way at your discretion. In other words, they are non-blocking and don't require a re-review unless you want one.

openff/toolkit/topology/molecule.py (outdated)
Comment on lines 1178 to 1183
- **properties** (dict): Outputs from chosen a chosen toolkit:

- **atom_map** (dict): Dictionary of atom index (as in ``atoms`` entry) and the mapped index relevant
to a mapped canonical smiles string
- **\*\*kwargs**: Other toolkit dependent outputs

Member

This dict is empty by default, i.e. not including the atom map, so I'd prefer either not listing it here or being more general about what can go in. IIRC it's a dumping ground for just about anything that can be represented in memory

In [10]: Molecule.from_smiles("CCO").to_dict()['properties']
Out[10]: {}

Contributor Author

So I made a comment on the PR wondering if I should be more specific since I can see the functions used by the toolkit for the other kwargs. I'll take that as a no :)

I can see that if this is a dumping ground for metadata, it shouldn't be specified, but in the case of atom_map, I might be uploading a molecule with a mapped CMILES and a geometry I got elsewhere... in that case I might need to figure out how to do this manually, which is why I thought it would be useful to explain.

Contributor Author

The docstring of to_smiles encourages users to supply an atom_map for this purpose, but it doesn't define the format. I think it's important to define it, and I believe this is the place to do it.
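
For illustration, here is a minimal sketch of the format under discussion, assuming atom_map is a plain dict from atom index (as in the ``atoms`` entry) to the map index used in the mapped SMILES, as the proposed docstring describes. The helper name `atom_map_from_mapped_smiles` is hypothetical and not part of the toolkit; it also assumes every atom is written in brackets with a map number and that atoms are indexed in the order they appear in the string.

```python
import re


def atom_map_from_mapped_smiles(mapped_smiles: str) -> dict[int, int]:
    """Build {atom_index: map_index} from a fully mapped SMILES string.

    Hypothetical helper, not a toolkit function. Assumes every atom is
    bracketed with a map number (e.g. "[C:1]") and that atom indices
    follow the order of appearance in the string.
    """
    # Each mapped atom ends in ":<number>]", so collect those numbers in order.
    map_indices = [int(m) for m in re.findall(r":(\d+)\]", mapped_smiles)]
    if len(set(map_indices)) != len(map_indices):
        raise ValueError("map indices must be unique")
    # Pair each atom's position in the string with its map index.
    return dict(enumerate(map_indices))


atom_map = atom_map_from_mapped_smiles("[C:2]([H:3])([H:4])([H:5])[O:1][H:6]")
print(atom_map)  # {0: 2, 1: 3, 2: 4, 3: 5, 4: 1, 5: 6}
```

If that reading of the format is right, a dict like this is what would be stored under the atom_map key of a molecule's properties before requesting a mapped SMILES. The docstrings of from_smiles and to_smiles remain the authority on the exact convention.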

Member

Ah totally my mistake, I didn't read the second half of that comment

IMHO

  • I have a preference for whatever gets this over the finish line with the lowest friction (I think the original objective, documenting the structure of Molecule-generated dictionaries, is more than met)
  • The added docs/examples you're considering adding are valuable but may be better for different portions of the docs than the docstrings of some serialization methods. I know CMILES is covered in the molecule cookbook but I forget the details

Contributor Author

I see that it is "verbally" explained in from_smiles. I can remove this if you want, or you can save me the commit.

Member

Either is fine with me, tiebreaker going to "do nothing because that's quicker"

(I tagged the unit string as the only blocker, everything else is up to you)

openff/toolkit/topology/molecule.py (outdated)
openff/toolkit/topology/molecule.py (outdated)
@jaclark5 jaclark5 merged commit 2f7bd4d into main Feb 13, 2025
12 of 22 checks passed
@jaclark5 jaclark5 deleted the issue_1998 branch February 13, 2025 22:49
Development

Successfully merging this pull request may close these issues.

Document that Molecule.from_dict accepts a list of lists but not a numpy array
2 participants