Topology.identical_molecule_groups scales poorly with many large molecules #2008
Labels: protein-performance (possibly related to speed of loading or parametrizing proteins)
Describe the bug
Topology.identical_molecule_groups scales super-linearly with molecule size when multiple large components are present in a system. This makes parametrizing large polymer systems unworkably slow. I believe this is a root cause of openforcefield/openff-interchange#1156.
To Reproduce
This is a simple and imperfect reproduction, but it shows that when multiple copies of a large molecule are present, runtime explodes once that molecule reaches a few hundred to a few thousand heavy atoms. That is not necessarily a large system in materials science.
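A minimal sketch of this kind of measurement, not the script used for this report. It assumes linear alkanes built from SMILES as a stand-in for the large molecules, and the helper names `time_identical_groups` and `scaling_table` are hypothetical. It times only the first access of `Topology.identical_molecule_groups` and reports how the runtime grows from one size to the next.

```python
import time


def time_identical_groups(n_carbons, n_copies=2):
    """Seconds spent in Topology.identical_molecule_groups.

    Assumption: a linear alkane with n_carbons heavy atoms stands in for a
    large polymer; this is not the system from the original report.
    """
    # Imported lazily so scaling_table can be used without the toolkit.
    from openff.toolkit import Molecule, Topology

    molecule = Molecule.from_smiles("C" * n_carbons)
    topology = Topology.from_molecules([molecule] * n_copies)

    start = time.perf_counter()
    topology.identical_molecule_groups  # property access triggers the grouping
    return time.perf_counter() - start


def scaling_table(timer, sizes):
    """Return (size, seconds, ratio_to_previous_size) rows."""
    rows = []
    previous = None
    for size in sizes:
        seconds = timer(size)
        # Ratio is undefined for the first size (or a zero previous timing).
        ratio = seconds / previous if previous else None
        rows.append((size, seconds, ratio))
        previous = seconds
    return rows


# Example usage (requires openff-toolkit; larger sizes may take a long time):
# for row in scaling_table(time_identical_groups, [100, 200, 400]):
#     print(row)
```

When the size doubles at each step, a ratio near 2 would suggest roughly linear scaling and a ratio near 4 roughly quadratic scaling. Anything well above that matches the behavior described here.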
Output
Additional context
This script also runs into memory issues with large molecules, which I haven't written up yet. There might be easier ways to prepare these topologies, but this one was quick to write.