Replies: 4 comments
-
Opened an issue: #20582.
-
If you are using the TensorFlow backend, you can do gradient accumulation the old TF way.
-
What is this old way?
-
I wanted to try this as well, but the problem currently appears to be that MirroredStrategy is broken in the latest versions of Keras (see #21061). Besides that, I don't think there is an official old way. That's why there were a few wrappers in the wild, for example this one: https://gradientaccumulator.readthedocs.io/en/latest/background/gradient_accumulation.html
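For reference, what people usually mean by doing it by hand is a custom training loop that sums gradients over several micro-batches before calling the optimizer. Here is a minimal single-device sketch, assuming eager execution and the TensorFlow backend. The helper name `train_with_accumulation`, the toy model, and the random data are made up for illustration, and I have not tried this under MirroredStrategy.

```python
import numpy as np
import tensorflow as tf


def train_with_accumulation(model, optimizer, dataset, accum_steps):
    """Hypothetical helper: one optimizer step per `accum_steps` micro-batches."""
    loss_fn = tf.keras.losses.MeanSquaredError()
    variables = model.trainable_variables
    # One zero-initialised accumulator per trainable variable.
    accum = [tf.zeros(v.shape, dtype=v.dtype) for v in variables]
    updates = 0
    for step, (x, y) in enumerate(dataset, start=1):
        with tf.GradientTape() as tape:
            # Divide by accum_steps so the summed gradients are an average
            # over the micro-batches rather than a sum.
            loss = loss_fn(y, model(x, training=True)) / accum_steps
        grads = tape.gradient(loss, variables)
        accum = [a + g for a, g in zip(accum, grads)]
        if step % accum_steps == 0:
            optimizer.apply_gradients(zip(accum, variables))
            accum = [tf.zeros(v.shape, dtype=v.dtype) for v in variables]
            updates += 1
    # Micro-batches left over at the end (fewer than accum_steps) are dropped.
    return updates


# Toy usage: 32 samples in micro-batches of 8, accumulated in groups of 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype("float32")
y = rng.normal(size=(32, 1)).astype("float32")
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(8)

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
updates = train_with_accumulation(model, optimizer, dataset, accum_steps=2)
```

With micro-batches of 8 and `accum_steps=2`, each optimizer step sees the averaged gradient of 16 samples. Running this under a distribution strategy would additionally need the per-replica step and cross-replica reduction, which this sketch does not attempt.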
-
Does anybody know if the implemented gradient accumulation can run under tf.distribute.MirroredStrategy?