
Lower reduction factor #5

Open
King4819 opened this issue Nov 24, 2024 · 4 comments

@King4819

Awesome work! I want to ask how the method works at a lower reduction factor. In your paper, the lowest reduction factor is 0.45; does the method still work at a lower reduction factor like 0.1? Hope to get your reply, thanks!

@Mxbonn
Owner

Mxbonn commented Nov 24, 2024

Once you go below 0.4, accuracy really starts to degrade quickly. If your model is still too large or slow at this reduction factor, it is better to look for a smaller baseline and apply some moderate token merging/pruning there.

@King4819
Author

@Mxbonn Thanks for your reply. Accuracy aside, I wonder whether the model can learn to reduce its size to an extremely low reduction factor (e.g., 0.1). I have encountered the problem that when I set the reduction factor to 0.1, the resulting model is still as big as if its reduction factor were 0.4. I'm very curious about this problem. Hope to get your reply, thanks!

@Mxbonn
Owner

Mxbonn commented Nov 24, 2024

It should be possible to reach such low reduction factors, but I personally have never tried it. The reduction factor is achieved through the loss function. However, the loss function is a combination of the classification loss and the FLOPs reduction factor loss. You can try to increase the weight of the regularization by adding the --reg-scale= flag with a larger value; the current default is 10. You can also try to increase the number of training epochs, since such a drastic reduction might need more training in order to converge.
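
For intuition, here is a minimal sketch of how such a combined objective is often structured. This is an assumption for illustration, not the repository's exact loss: the function name `total_loss`, the argument names, and the squared penalty are all hypothetical; `reg_scale` corresponds to the `--reg-scale` flag (default 10 per the comment above).

```python
import torch

def total_loss(cls_loss: torch.Tensor,
               flops_ratio: torch.Tensor,
               reduction_target: float,
               reg_scale: float = 10.0) -> torch.Tensor:
    """Combine the classification loss with a FLOPs regularization term.

    cls_loss:         standard classification loss (e.g. cross-entropy)
    flops_ratio:      differentiable estimate of the reduced model's FLOPs
                      divided by the full model's FLOPs
    reduction_target: desired reduction factor (e.g. 0.1)
    reg_scale:        weight of the regularization term (--reg-scale)
    """
    # Penalize deviation of the achieved FLOPs ratio from the target.
    # A squared penalty is a common choice here, but this is an
    # assumption, not necessarily the repo's exact formulation.
    reg_loss = (flops_ratio - reduction_target) ** 2
    return cls_loss + reg_scale * reg_loss
```

In this framing, a model that plateaus near an effective reduction of 0.4 when the target is 0.1 would suggest the classification term dominates the gradient, which is why raising --reg-scale can push the ratio closer to the target.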

@King4819
Author

King4819 commented Nov 25, 2024

@Mxbonn Thanks for your reply. I have tried setting reduction_target to 0.3, and the resulting model comes out at 1.9 GFLOPs, which does not match the target very accurately.
