exit from deadlock if training failed
denniswittich committed Feb 1, 2024
1 parent 23e738a commit e9c3999
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion learning_loop_node/trainer/trainer_logic.py
@@ -334,7 +334,9 @@ async def upload_model(self) -> None:
         try:
             new_model_id = await self._upload_model_return_new_id(self.training.context)
             if new_model_id is None:
-                raise Exception('could not upload model - maybe training failed')
+                self.training.training_state = TrainingState.ReadyForCleanup
+                logging.error('could not upload model - maybe training failed.. cleaning up')
+                return
             assert new_model_id is not None, 'uploaded_model must be set'
             logging.info(f'successfully uploaded model and received new model id: {new_model_id}')
             self.training.model_id_for_detecting = new_model_id
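
The pattern in this change can be illustrated with a minimal standalone sketch: when the upload yields no model id, the code sets a cleanup state, logs an error, and returns instead of raising, which per the commit title is meant to let the trainer exit a deadlock after a failed training. Only `TrainingState.ReadyForCleanup`, `training_state`, and `model_id_for_detecting` come from the diff. The `Training` dataclass, the `Uploading` state, `fake_upload`, and the free-standing `upload_model` function are hypothetical stand-ins and not the actual learning_loop_node API.

```python
import asyncio
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TrainingState(str, Enum):
    # Only ReadyForCleanup appears in the diff; Uploading is a hypothetical placeholder.
    Uploading = 'uploading'
    ReadyForCleanup = 'ready_for_cleanup'


@dataclass
class Training:
    # Hypothetical, simplified stand-in for the real training object.
    training_state: TrainingState = TrainingState.Uploading
    model_id_for_detecting: Optional[str] = None


async def fake_upload(succeed: bool) -> Optional[str]:
    # Hypothetical upload helper: returns a model id, or None on failure.
    return 'model-123' if succeed else None


async def upload_model(training: Training, succeed: bool) -> None:
    new_model_id = await fake_upload(succeed)
    if new_model_id is None:
        # Switch to a cleanup state and return instead of raising,
        # so the caller can proceed with cleanup.
        training.training_state = TrainingState.ReadyForCleanup
        logging.error('could not upload model - maybe training failed.. cleaning up')
        return
    training.model_id_for_detecting = new_model_id


# Simulate a failed upload and a successful one.
failed = Training()
asyncio.run(upload_model(failed, succeed=False))

succeeded = Training()
asyncio.run(upload_model(succeeded, succeed=True))
```

In this sketch the failed run ends in `ReadyForCleanup` with no model id set, while the successful run keeps its state and records the new id. How the real trainer reacts to `ReadyForCleanup` is not shown in the diff and is not modeled here.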