calculate score calculation within callback #21076

Open
pure-rgb opened this issue Mar 20, 2025 · 3 comments


@pure-rgb

import keras

def get_model():
    model = keras.Sequential()
    model.add(keras.layers.Dense(1))
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=0.1),
        loss="mean_squared_error",
        metrics=["mean_absolute_error"],
    )
    return model

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
x_train = x_train[:1000]
y_train = y_train[:1000]
x_test = x_test[:1000]
y_test = y_test[:1000]

class CustomCallback(keras.callbacks.Callback):
    def __init__(self, x, y):
        super().__init__()
        self.x = x
        self.y = y
        
    def on_epoch_end(self, epoch, logs=None):
        y_pred = self.model.predict(self.x, verbose=0)
        score = self.model.compute_metrics(self.x, self.y, y_pred, sample_weight=None)
        print()
        print(score)

model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=256,
    epochs=5,
    verbose=1,
    callbacks=[CustomCallback(x_train, y_train)],
)
Epoch 1/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 1s 526ms/step - loss: 25.3436 - mean_absolute_error: 4.2441
{'loss': 242.523193359375, 'mean_absolute_error': 6.016280174255371}
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - loss: 256.4755 - mean_absolute_error: 10.3880
Epoch 2/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 0s 27ms/step - loss: 6.6658 - mean_absolute_error: 2.1646
{'loss': 6.03378438949585, 'mean_absolute_error': 1.8999731540679932}
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step - loss: 6.2043 - mean_absolute_error: 2.0708
Epoch 3/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 4.1691 - mean_absolute_error: 1.6324
{'loss': 4.564587593078613, 'mean_absolute_error': 1.7218464612960815}
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step - loss: 4.4746 - mean_absolute_error: 1.7039
Epoch 4/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step - loss: 4.4333 - mean_absolute_error: 1.7299
{'loss': 4.227972030639648, 'mean_absolute_error': 1.6317805051803589}
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step - loss: 4.2346 - mean_absolute_error: 1.6612
Epoch 5/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - loss: 3.7971 - mean_absolute_error: 1.5549
{'loss': 5.39981746673584, 'mean_absolute_error': 2.682666063308716}
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step - loss: 4.6834 - mean_absolute_error: 1.7220
<keras.src.callbacks.history.History at 0x7fa5642bba60>
  1. About the printed log: why does the callback's print appear in the middle of the epoch, when it should appear only after the epoch has finished?
Epoch 1/5
1/4 ━━━━━━━━━━━━━━━━━━━━ 1s 526ms/step - loss: 25.3436 - mean_absolute_error: 4.2441
{'loss': 242.523193359375, 'mean_absolute_error': 6.016280174255371}
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - loss: 256.4755 - mean_absolute_error: 10.3880
  2. About the score: it should match, but the callback reports a loss of 242 while the log reports 256, and the callback reports an MAE of 6.01 while the log reports 10.38.
@pctablet505
Collaborator

The loss and metrics displayed in the progress bar are for each batch (mini-batch), whereas the output on the next line is for the whole epoch.
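
As a side note, the logs dict that Keras passes to on_epoch_end already carries the epoch-level aggregates shown on the final progress-bar line, so they can be printed without recomputing anything. A minimal sketch, assuming the standard Keras 3 callback API (LogPrinter is just an illustrative name, not from this issue):

import keras

class LogPrinter(keras.callbacks.Callback):  # illustrative name
    def on_epoch_end(self, epoch, logs=None):
        # `logs` already holds the epoch-level aggregates that fit() shows
        # on the final progress-bar line, e.g. {'loss': ..., 'mean_absolute_error': ...}.
        print(f"\nepoch {epoch}: {logs}")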

@pure-rgb
Author

ok.

Now, if we do this, then I should be able to get them to match.

x_train = x_train[:256]
y_train = y_train[:256]
x_test = x_test[:256]
y_test = y_test[:256]

model = get_model()
model.fit(
    x_train,
    y_train,
    batch_size=256,
    epochs=5,
    verbose=1,
    callbacks=[CustomCallback(x_train, y_train)],
)
Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 405ms/step - loss: 25.9250 - mean_absolute_error: 4.1543
{'loss': 25.925048828125, 'mean_absolute_error': 16.05439567565918}
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 490ms/step - loss: 25.9250 - mean_absolute_error: 4.1543

The loss matches, but why don't the metrics?

@JyotinderSingh
Collaborator

Why the Callback Print Appears "Mid-Epoch"

  • The on_epoch_end method in your CustomCallback is only executed after model.fit has completed all the training batches for that epoch (all 4 batches, in this case); see the sketch after this list. What you're seeing in the output is usually an artifact of how different print operations interact with the console output buffer.
  • Your callback uses a standard print() statement. Sometimes the callback's print output flushes to the console before the final summary line (4/4 ...) from the progress bar does, or the two get interleaved slightly.
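
A quick way to convince yourself of the ordering is a tiny callback that counts batches. This is a sketch assuming the standard Keras 3 callback hooks (BatchCounter is just an illustrative name):

import keras

class BatchCounter(keras.callbacks.Callback):  # illustrative name
    def on_epoch_begin(self, epoch, logs=None):
        self.batches_seen = 0

    def on_train_batch_end(self, batch, logs=None):
        self.batches_seen += 1

    def on_epoch_end(self, epoch, logs=None):
        # With 1000 samples and batch_size=256 this prints 4, confirming that
        # every batch finished before on_epoch_end ran; any interleaving in the
        # console output is only a flushing artifact.
        print(f"\nepoch {epoch}: on_epoch_end ran after {self.batches_seen} batches", flush=True)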

Why the Callback's score Metrics Don't Match the fit Log

Keras metrics (like MeanAbsoluteError and even the loss tracked as a metric) are stateful. They maintain internal variables that get updated over time. model.fit calls reset_state() on these metrics at the start of each epoch.

self.reset_metrics()

For each batch, fit calls update_state() with that batch's results.

The values logged by fit (e.g., loss: 256.4755, mean_absolute_error: 10.3880 on the final 4/4 line for Epoch 1) are the result() computed from the metric's state after it has been updated incrementally across all the batches in that epoch. They represent the average performance during training for that epoch.

At on_epoch_end (after fit is done with batches), your code runs self.model.predict(...) to get predictions for the entire x_train.

Then, score = self.model.compute_metrics(...) is called.

This compute_metrics call performs another update_state() on the same, existing stateful metric objects that fit was using.

This update uses all the 1000 samples. It happens without resetting the state first.

The score you print (e.g., {'loss': 242.52..., 'mean_absolute_error': 6.01...} for Epoch 1) is the result() calculated after this additional bulk update_state.

Since the internal state of the metric objects is different when result() is called in these two scenarios, the numbers don't match.
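
The effect is easy to reproduce with a bare metric object. A minimal sketch with made-up numbers:

import numpy as np
import keras

mae = keras.metrics.MeanAbsoluteError()

# What fit() does: reset once at the start of the epoch, then update per batch.
mae.reset_state()
mae.update_state(np.array([1.0, 2.0]), np.array([1.5, 2.5]))  # batch 1, per-batch MAE 0.5
mae.update_state(np.array([3.0, 4.0]), np.array([3.0, 5.0]))  # batch 2, per-batch MAE 0.5
print(float(mae.result()))  # 0.5 -- the epoch average that fit() logs

# What the callback's compute_metrics() effectively adds: one more update on the
# same stateful object, without a reset, so the running average shifts.
mae.update_state(np.array([0.0]), np.array([10.0]))  # one extra sample, absolute error 10
print(float(mae.result()))  # 2.4 -- no longer the logged epoch value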

One guess for the especially large discrepancies you may see in the initial epochs is that the first mini-batch can have a very large MAE, which, when folded into the running average across all batches, skews the values being printed.
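
If you want a clean full-dataset score at the end of each epoch, independent of the running training averages, one common workaround is to let evaluate() do the bookkeeping, since it resets the compiled metrics before computing. A sketch under that assumption (EvalCallback is just an illustrative name):

import keras

class EvalCallback(keras.callbacks.Callback):  # illustrative name
    def __init__(self, x, y):
        super().__init__()
        self.x = x
        self.y = y

    def on_epoch_end(self, epoch, logs=None):
        # evaluate() resets the compiled metrics, runs over all of x, and returns
        # fresh numbers; fit() resets the metrics again at the start of the next
        # epoch, so the training logs are not affected.
        score = self.model.evaluate(self.x, self.y, verbose=0, return_dict=True)
        print(f"\nepoch {epoch} full-dataset score: {score}")

Note that these values will still not equal the progress-bar numbers, for the reasons above: the progress bar shows a running average accumulated while the weights are changing during the epoch.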
