Too many parameters (fc layers) in both CNN encoder and RNN decoder, causing dramatic overfitting! #51

Open
mashijie1028 opened this issue Aug 18, 2021 · 2 comments

mashijie1028 commented Aug 18, 2021

There are too many fc layers in both the CNN encoder and the RNN decoder; one is enough. When I trained the CRNN, I got over 70% test accuracy with only one fc layer in each of the CNN and the LSTM (although there is still severe overfitting). As num_fc_layers increases, performance degrades.
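For illustration, a minimal sketch of an encoder whose head is a single fc layer (PyTorch; this is not the repo's actual code, and `CNNEncoder` / `embed_dim` are names I made up):

```python
import torch.nn as nn
import torchvision.models as models

# Sketch only: a CNN encoder with a single fc head. `embed_dim` is a
# hypothetical name for the per-frame embedding size fed to the LSTM.
class CNNEncoder(nn.Module):
    def __init__(self, embed_dim=512):
        super().__init__()
        resnet = models.resnet18(pretrained=True)
        # Keep the convolutional trunk + global average pool; drop ResNet's classifier.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        # One fc layer maps pooled features to the embedding - no fc1/fc2/fc3
        # stack, which is where most of the surplus parameters live.
        self.fc = nn.Linear(resnet.fc.in_features, embed_dim)

    def forward(self, x_seq):                        # x_seq: (batch, time, C, H, W)
        b, t = x_seq.shape[:2]
        feats = self.backbone(x_seq.flatten(0, 1))   # (b*t, 512, 1, 1)
        return self.fc(feats.flatten(1)).view(b, t, -1)
```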

Also, BatchNorm probably conflicts with dropout: dropout shifts the activation statistics that BN estimates during training, and BN already acts as a regularizer. Maybe no dropout is better.
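Under the same assumptions, the matching decoder side could look like this (again just a sketch, reusing the imports above; one fc layer and no dropout, and `num_classes=101` assumes UCF101):

```python
# Sketch only: an LSTM decoder with a single fc head and no dropout,
# leaving the BN layers in the ResNet trunk as the regularizer.
class RNNDecoder(nn.Module):
    def __init__(self, embed_dim=512, hidden_dim=256, num_classes=101):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)  # the only fc layer

    def forward(self, x):            # x: (batch, time, embed_dim)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])   # classify from the last time step
```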

mashijie1028 commented Aug 18, 2021

I was wondering how you got 85.68% test accuracy with ResNet-152 + LSTM. Could you please share the hyper-parameters? Thanks!
@HHTseng

mashijie1028 (Author) commented

I used ResNet-18 (pretrained) + LSTM and got over 80% test accuracy, but only 40% test accuracy when training ResNet-18 + LSTM from scratch. It seems that pretraining the ResNet CNN encoder on ImageNet is essential.
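For reference, the only difference between the two runs is whether the trunk starts from ImageNet weights. A sketch of the switch (`build_encoder` is an illustrative helper, using the torchvision API that was current in 2021):

```python
import torch.nn as nn
import torchvision.models as models

def build_encoder(pretrained, embed_dim=512):
    # pretrained=True loads ImageNet weights; False gives random init.
    resnet = models.resnet18(pretrained=pretrained)
    backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop classifier
    return nn.Sequential(backbone, nn.Flatten(1),
                         nn.Linear(resnet.fc.in_features, embed_dim))

encoder = build_encoder(pretrained=True)     # >80% test acc in my runs
# encoder = build_encoder(pretrained=False)  # only ~40% from scratch
```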
