Training logs the final checkpoint even if checkpoint_every=0 #666

Open
debrevitatevitae opened this issue Feb 7, 2025 · 2 comments
Labels: bug (Something isn't working)

@debrevitatevitae (Collaborator)

Short description

There is currently no way to avoid checkpointing altogether when training.
The docs state that checkpoint_every=0 disables checkpointing altogether (see here), but this is not true, because the final checkpoint is logged anyway.

What is the expected result?

An option to avoid all checkpointing. This is useful for

  • testing
  • prototyping
  • when logging is handled externally
  • etc.

What is the actual result?

The final model/optimizer states are checkpointed even though TrainConfig.checkpoint_every=0.

Steps/Code to reproduce

MWE:

from __future__ import annotations

import torch
from torch.utils.data import DataLoader

from qadence.circuit import QuantumCircuit
from qadence.constructors import hea
from qadence.constructors.feature_maps import feature_map
from qadence.constructors.hamiltonians import hamiltonian_factory
from qadence.ml_tools.config import TrainConfig
from qadence.ml_tools.data import to_dataloader
from qadence.ml_tools.models import QNN
from qadence.ml_tools.trainer import Trainer
from qadence.operations.primitive import Z

n_qubits = 2
ansatz_depth = 1


def dataloader(batch_size: int = 25) -> DataLoader:
    x = torch.linspace(0, 1, batch_size).reshape(-1, 1)
    y = torch.cos(x)
    return to_dataloader(x, y, batch_size=batch_size, infinite=True)


obs = hamiltonian_factory(register=n_qubits, detuning=Z)

data = dataloader()
fm = feature_map(n_qubits, param="x")

model = QNN(
    QuantumCircuit(n_qubits, fm, hea(n_qubits, ansatz_depth)),
    observable=obs,
    inputs=["x"],
)

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

# checkpoint_every=0 is documented to disable checkpointing altogether
config = TrainConfig(max_iter=100, checkpoint_every=0)

trainer = Trainer(model=model, optimizer=optimizer, config=config)
trainer.fit(data)  # the final model/optimizer checkpoints are still written after this call
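
A quick way to observe the bug (illustrative only: run_dir is a placeholder for whatever folder the trainer reports writing its logs to; the exact layout depends on the TrainConfig defaults):

from pathlib import Path

run_dir = Path("./qml_logs")  # placeholder; substitute the folder printed by the trainer
print([p.name for p in run_dir.glob("*checkpoint*")])
# Expected with checkpoint_every=0: no matches.
# Actual: the final model/optimizer checkpoint files are listed.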

Tracebacks (optional)

(screenshot attached)

Environment details (optional)

qadence=="1.10.1"

Would you like to work on this issue?

Yes

debrevitatevitae added the bug (Something isn't working) label Feb 7, 2025
mlahariya self-assigned this Feb 11, 2025
@mlahariya (Collaborator) commented Feb 12, 2025

This is the legacy behaviour of the callbacks. Initially, a checkpoint was always saved at the end of training regardless of checkpoint_every, and a checkpoint was also saved at the start of training whenever validation is used. See #593 (comment).

@debrevitatevitae - We can definitely change this. Would you like to work on it, or should I pick it up?
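
For illustration, the change could amount to guarding the end-of-training save on checkpoint_every. A minimal sketch of the intended behaviour (the names _Config, save_final_checkpoint and save_fn are placeholders, not qadence internals):

from dataclasses import dataclass
from typing import Callable


@dataclass
class _Config:
    # mirrors the relevant TrainConfig field: 0 is documented to disable checkpointing
    checkpoint_every: int = 0


def save_final_checkpoint(config: _Config, save_fn: Callable[[], None]) -> None:
    # only write the end-of-training checkpoint when checkpointing is enabled
    if config.checkpoint_every > 0:
        save_fn()

The same guard would apply to the checkpoint currently saved at the start of training when validation is used.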

@debrevitatevitae (Collaborator, Author)

Glad to work on it ;)
