Better error message when inconsistent files are specified in read_from for energies and forces #440

Open
SanggyuChong opened this issue Jan 6, 2025 · 0 comments
Labels
Good first issue Good for newcomers Infrastructure: Data Related to data handling like readers and datasets Priority: Medium Important issues to address after high priority.

Comments

@SanggyuChong
Contributor

Hello,

Thanks a lot for all your efforts in mtt development. Nice to be on the user's end!

I made a mistake while training a model today, providing different file names to read_from for the energy and the forces:

training_set:
  systems:
    read_from: VALID.xyz
    length_unit: angstrom
  targets:
    energy:
      key: "DFT_energy"
      unit: "eV" # very important to run simulations
      forces:
        read_from: TRAIN.xyz
        key: "DFT_forces"

Doing something like this currently produces the following traceback:

Traceback (most recent call last):
 File "/home/chong/mtt_venv/lib/python3.11/site-packages/metatrain/cli/train.py", line 400, in train_model
  trainer.train(
 File "/home/chong/mtt_venv/lib/python3.11/site-packages/metatrain/experimental/soap_bpnn/trainer.py", line 336, in train
  targets = remove_additive(
       ^^^^^^^^^^^^^^^^
 File "/home/chong/mtt_venv/lib/python3.11/site-packages/metatrain/utils/additive/remove.py", line 66, in remove_additive
  metatensor.torch.TensorBlock(
RuntimeError: invalid parameter: data and labels don't match: the array shape along axis 0 is 96 but we have 160 sample labels

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/home/chong/mtt_venv/lib/python3.11/site-packages/metatrain/__main__.py", line 107, in main
  train_model(**args.__dict__)
 File "/home/chong/mtt_venv/lib/python3.11/site-packages/metatrain/cli/train.py", line 409, in train_model
  raise ArchitectureError(e)
metatrain.utils.errors.ArchitectureError: invalid parameter: data and labels don't match: the array shape along axis 0 is 96 but we have 160 sample labels

The error above most likely originates from an architecture.

If you think this is a bug, please contact its maintainer (see the architecture's documentation) and include the full traceback error.log.
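
The mismatch reported above (96 versus 160 along axis 0) suggests that the two files do not describe the same structures. Below is a minimal sketch of an early check, assuming both files use the plain XYZ layout (an atom-count line, a comment line, then one line per atom). It compares the per-structure atom counts of the two files before any training starts. The names `xyz_atom_counts` and `check_files_match` are hypothetical and are not part of metatrain.

```python
def xyz_atom_counts(text: str) -> list[int]:
    """Return the number of atoms in each structure of XYZ-formatted text."""
    lines = text.splitlines()
    counts = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between structures
            continue
        n_atoms = int(lines[i].strip())  # first line of a structure: atom count
        counts.append(n_atoms)
        i += 2 + n_atoms  # skip the count line, the comment line and the atom lines
    if i > len(lines):
        raise ValueError("the last structure is truncated")
    return counts


def check_files_match(
    systems_text: str, target_text: str, systems_name: str, target_name: str
) -> None:
    """Raise a readable error if the two files cannot describe the same structures."""
    systems_counts = xyz_atom_counts(systems_text)
    target_counts = xyz_atom_counts(target_text)
    if len(systems_counts) != len(target_counts):
        raise ValueError(
            f"'{systems_name}' contains {len(systems_counts)} structures but "
            f"'{target_name}' contains {len(target_counts)}; "
            "check the read_from entries"
        )
    for index, (a, b) in enumerate(zip(systems_counts, target_counts)):
        if a != b:
            raise ValueError(
                f"structure {index} has {a} atoms in '{systems_name}' but "
                f"{b} atoms in '{target_name}'; check the read_from entries"
            )
```

Running such a check right after the files are read would report the file names involved, rather than a shape mismatch deep inside the trainer.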

While the user is ultimately responsible for providing the correct file names, @Luthaf asked me to open this issue to discuss how such mistakes could be caught with a more straightforward error message.
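
One possible direction, sketched under assumptions: walk the dataset options and list every nested section whose read_from differs from that of its parent target. Since read_from can be set separately per section, differing names are presumably not an error in themselves, so the sketch only collects notes that could be logged or attached to a later error. The function name `find_read_from_overrides` is hypothetical, and the fallback rules (a target without read_from uses the systems file, a nested section without read_from uses its target's file) are assumptions, not confirmed metatrain behaviour.

```python
def find_read_from_overrides(dataset: dict) -> list[str]:
    """List nested sections that read from a different file than their target."""
    systems_file = dataset["systems"]["read_from"]
    notes = []
    for target_name, target in dataset.get("targets", {}).items():
        # Assumption: a target without its own read_from uses the systems file.
        target_file = target.get("read_from", systems_file)
        for section_name, section in target.items():
            # Only nested mappings (such as "forces") can carry their own read_from.
            if not isinstance(section, dict) or "read_from" not in section:
                continue
            if section["read_from"] != target_file:
                notes.append(
                    f"{target_name}.{section_name} reads from "
                    f"'{section['read_from']}' but {target_name} reads from "
                    f"'{target_file}'"
                )
    return notes


# The options from this issue, written as a plain dict for illustration.
training_set = {
    "systems": {"read_from": "VALID.xyz", "length_unit": "angstrom"},
    "targets": {
        "energy": {
            "key": "DFT_energy",
            "unit": "eV",
            "forces": {"read_from": "TRAIN.xyz", "key": "DFT_forces"},
        }
    },
}

for note in find_read_from_overrides(training_set):
    print("note:", note)
```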

Thanks a lot!

Best,
Raymond

@frostedoyster frostedoyster self-assigned this Jan 11, 2025
@frostedoyster frostedoyster added Good first issue Good for newcomers Priority: Medium Important issues to address after high priority. Infrastructure: Data Related to data handling like readers and datasets labels Jan 11, 2025
@frostedoyster frostedoyster removed their assignment Jan 11, 2025