KL-Loss very large #13
Comments
All three parts are needed in training; you cannot just optimize loss_bbox.
Thanks for your reply. I am training on COCO with YOLOv3; now bbox_pred_std_abs_logw_loss is a small negative number, and it keeps getting smaller and smaller.
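A small sanity check of why this is expected (my own sketch, assuming the common α = log(σ²) parameterization, not code from this repo): for a fixed regression error e, the per-coordinate term exp(-α)·e + α/2 is minimized at α = log(2e), so the log-variance term naturally drifts negative as the regression error shrinks, while the exp(-α)·e term keeps the sum bounded below.

```python
# Hypothetical sanity check, NOT the repo's code: assumes the per-coordinate
# loss f(alpha) = exp(-alpha) * e + alpha / 2 with alpha = log(sigma^2).
import math

e = 0.01                      # a small regression error, as seen late in training
alpha_star = math.log(2 * e)  # the alpha that minimizes f for this error
f_min = math.exp(-alpha_star) * e + alpha_star / 2

print(alpha_star)  # ~ -3.91: the logw-style term is negative and shrinks with e
print(f_min)       # ~ -1.46: still finite, because exp(-alpha) * e balances it
```

So a slowly shrinking negative logw loss is not by itself a sign of divergence, as long as the mulw term is optimized together with it.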
Yes, but as I said, bbox_pred_std_abs_mulw_loss and bbox_pred_std_abs_logw_loss produce gradients for the std, while loss_bbox produces gradients for the bbox.
But the gradients of bbox_pred_std_abs_mulw_loss will also flow to both bbox and std, because bbox_pred_std_abs_mulw_loss = bbox_pred_std_abs * ('bbox_pred' - 'bbox_targets'), resulting in a final loss = nan.
@csu-xiewei No, the stop-gradient prevents the gradient from flowing to the bbox.
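A minimal sketch of what this stop-gradient arrangement looks like (written in PyTorch purely for illustration; the repo itself uses a different framework, and the tensor names and weighting here are assumptions, not the repo's exact code):

```python
# Illustrative PyTorch sketch (assumed names and weighting), showing how
# detach / stop-gradient keeps the mulw term from sending gradients into
# bbox_pred even though it multiplies the regression error.
import torch

def mulw_and_bbox_terms(bbox_pred, bbox_targets, bbox_pred_std_abs):
    err = torch.abs(bbox_pred - bbox_targets)

    # mulw-style term: the error factor is detached (stop-gradient), so this
    # term only produces gradients for the std branch, not for bbox_pred.
    mulw_loss = (bbox_pred_std_abs * err.detach()).mean()

    # bbox regression term: the std-derived weight is detached (the exact
    # weight used in the repo may differ), so this term only produces
    # gradients for the bbox branch.
    loss_bbox = (bbox_pred_std_abs.detach() * err).mean()

    return mulw_loss, loss_bbox
```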
The KL-loss has three parts: bbox_pred_std_abs_mulw_loss, bbox_pred_std_abs_logw_loss, and loss_bbox.
When I add them together, bbox_pred_std_abs_logw_loss becomes a very large negative number, resulting in a final loss = nan. If only loss_bbox is optimized, then the log loss becomes a very large positive number, making the final loss_bbox almost zero. I can reproduce how the KL-loss is calculated in the code, but how did you actually train it in the end? Can you help me?
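For reference, my reading of the per-coordinate regression loss behind these three terms (a hedged reconstruction from the KL-Loss paper, with α = log(σ²); the exact split used in the code may differ):

```latex
% Per-coordinate regression loss, alpha = log(sigma^2)
% (hedged reconstruction; verify against the paper and the repo):
L_{reg} \approx
\begin{cases}
\dfrac{e^{-\alpha}}{2}(x_g - x_e)^2 + \dfrac{\alpha}{2}, & |x_g - x_e| \le 1 \\[4pt]
e^{-\alpha}\left(|x_g - x_e| - \dfrac{1}{2}\right) + \dfrac{\alpha}{2}, & |x_g - x_e| > 1
\end{cases}
```

Here the exp(-α)-weighted error corresponds roughly to the mulw-style term, α/2 to the logw-style term, and the smooth-L1 error itself to loss_bbox. Optimizing only one of them removes the counterweight that keeps α finite, which matches the blow-ups described above.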