Train dtype sampling during training and clarifications. #360
Hello! I noticed that sampling during training uses the train_data dtype. Is there a particular reason for using that instead of the model dtype? Does the train data dtype mainly affect the dtype of the input data and the gradients, as opposed to directly affecting the dtype of the model weights? That is what I had gathered, but I wanted to verify. Specifically, I saw that it stated: "Internally, this sets the mixed precision data type when doing the forward pass through the model. This setting trades precision for speed during training".
Replies: 1 comment
My understanding is that the model type affects the precision with which the model weights are loaded, while the train type affects the precision in which training is carried out.
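To make the distinction concrete, here is a minimal sketch in plain PyTorch. It assumes the train dtype is applied through `torch.autocast`, which is how mixed precision is commonly done but is not confirmed in this thread for this project. The names `model_dtype` and `train_dtype` are hypothetical stand-ins for the two settings, and the tiny `Linear` layer stands in for the real model.

```python
import torch

# Hypothetical stand-ins for the two settings discussed above.
model_dtype = torch.float32   # precision the weights are loaded/stored in
train_dtype = torch.bfloat16  # precision used for the forward pass

# A toy model whose weights are stored in model_dtype.
model = torch.nn.Linear(8, 4).to(dtype=model_dtype)
x = torch.randn(2, 8)

# Assumption: the train dtype is applied via autocast (CPU here so it runs anywhere).
with torch.autocast(device_type="cpu", dtype=train_dtype):
    y = model(x)  # the matmul runs in train_dtype

# Backward pass outside the autocast block; cast up before reducing.
y.float().sum().backward()

weight_dtype = model.weight.dtype      # unchanged by autocast
output_dtype = y.dtype                 # follows train_dtype
grad_dtype = model.weight.grad.dtype   # stored in the weight's dtype

print(weight_dtype, output_dtype, grad_dtype)
```

In this sketch the stored weights and their accumulated gradients stay in the model dtype, while the activations produced by the forward pass follow the train dtype. If sampling during training goes through the same autocast context, that would explain why it picks up the train dtype as well, though that part is a guess.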