Trained discriminator not working after netD.eval() or shuffle = False #10
Comments
Happy to help diagnose the issue. Can you send along two curves of AUROC over epochs: one with netD.eval() and the other without? The shuffle setting should not be crucial.
Shu
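A minimal sketch (not from the repo) of how such AUROC-vs-epoch curves could be produced; netD, closed_loader, and open_loader are assumed names, and netD is assumed to output a single open-set score per sample:

```python
import torch
from sklearn.metrics import roc_auc_score

@torch.no_grad()
def epoch_auroc(netD, closed_loader, open_loader, use_eval, device="cuda"):
    # Toggle between the two settings being compared in this thread.
    netD.eval() if use_eval else netD.train()
    scores, labels = [], []
    for loader, label in ((closed_loader, 0.0), (open_loader, 1.0)):
        for x, _ in loader:
            s = netD(x.to(device)).view(-1).cpu()
            scores.append(s)
            labels.append(torch.full_like(s, label))
    # AUROC of the discriminator score for separating open-set (1) from closed-set (0) samples.
    return roc_auc_score(torch.cat(labels).numpy(), torch.cat(scores).numpy())
```

Calling this once per epoch (or per saved checkpoint) with use_eval=True and again with use_eval=False would give the two requested curves.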
|
Thanks for the curves! I recall that I observed the same in my experiments; that's why I didn't use .eval(). I conjecture that this is because the discriminator is trained in train mode to best discriminate between outliers and the closed set, so when eval mode is turned on, the discriminator can't work well. I can't remember whether I tried this or not -- set the batch size to 1 and compare the eval and train modes again. Would you like to try that?
Shu
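A minimal sketch of the batch-size-1 comparison suggested above, with assumed names (netD, test_set) and netD again assumed to output one score per sample:

```python
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def collect_scores(netD, dataset, train_mode, device="cuda"):
    # With batch size 1, train-mode batch-norm statistics come from a single
    # sample, so no other test sample can influence the score. Note that
    # train-mode forward passes still update the running BN statistics.
    netD.train() if train_mode else netD.eval()
    loader = DataLoader(dataset, batch_size=1, shuffle=False)
    return torch.cat([netD(x.to(device)).view(-1).cpu() for x, _ in loader])

# scores_train = collect_scores(netD, test_set, train_mode=True)
# scores_eval  = collect_scores(netD, test_set, train_mode=False)
```

One caveat: with a single-sample batch, train mode raises an error for BatchNorm1d layers on (1, C) inputs, since there is only one value per channel to normalize; convolutional BatchNorm2d layers over spatial maps generally still run.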
…On Mon, Jul 4, 2022 at 3:04 AM 1214710638 wrote:
[image: 企业微信截图_16569180502202]
<https://user-images.githubusercontent.com/24508284/177099380-3d3a2193-2bdd-4aa7-90a7-638a8762df41.png>
[image: 企业微信截图_16569180758575]
<https://user-images.githubusercontent.com/24508284/177099450-6311852d-399d-4f37-894c-afb4d30356d5.png>
Also, there are curves for the mean confidence scores; you can see the one with netD.eval() is not separable.
|
I checked more setups, and none of them worked under eval mode. I looked closely into your released test demo and discussed it with some other people. If you use train mode and batched inference to produce the test results, it might be a kind of cheating: in train mode, batch norm statistics are computed from the given batch, and in your setup each batch consists entirely of open-set samples (or entirely of closed-set samples). Therefore, the test results indirectly take advantage of the label information and the test distribution, which is an unfair comparison against other methods and could undermine the claims in the paper. You might want to take a closer look at this and clarify it with more experiments. |
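A minimal illustration (not from the repo) of the batch-statistics point above: in train mode, a BatchNorm layer normalizes with the statistics of the current batch, so the output for the very same sample depends on what else is in the batch; in eval mode, it uses the stored running statistics and is batch-independent:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)
x = torch.randn(1, 3, 8, 8)                               # one fixed sample
batch_a = torch.cat([x, torch.randn(15, 3, 8, 8)])        # sample inside a "closed-set-like" batch
batch_b = torch.cat([x, torch.randn(15, 3, 8, 8) + 3.0])  # same sample inside a shifted, "open-set-like" batch

bn.train()
diff_train = (bn(batch_a)[0] - bn(batch_b)[0]).abs().max()  # large: normalization depends on batch mates

bn.eval()
diff_eval = (bn(batch_a)[0] - bn(batch_b)[0]).abs().max()   # ~0: running stats, batch-independent

print(diff_train.item(), diff_eval.item())
```

This is why a test batch made up entirely of open-set (or entirely closed-set) samples can implicitly feed the test distribution into the scores.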
Thanks for the discussion! This is a fair point, and I agree that there can be information leakage in the testing implementation. That made me think about setting the batch size to 1 when testing in train mode. I'll try this later; I'd appreciate it if you could help try it by reusing your trained models.
Shu
|
In my experiments, even when .train() mode is turned on in the test phase, the performance still relies heavily on the batch size, the order of the test sequence, etc., which makes this idea hard to follow :( |
When testing in eval mode I also get random results. That is |
Hello, thanks for open-sourcing your code. I noticed that in demo_CrossDatasetOpenSet_testing.ipynb, you didn't set netD.eval() before testing, and the dataloader definition sets shuffle=True. I trained and tested the model following your demo; however, when I set netD.eval() or shuffle=False in the dataloader, the test results are not good and the discriminator is not working at all. Am I missing something, or do you have any suggestions for me?
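For reference, a minimal, self-contained sketch (dummy data and a dummy discriminator, not the notebook's actual code) of the two test configurations being compared in this issue: the demo's setting of train mode with a shuffled test loader, versus eval mode with shuffle=False:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins for the notebook's test set and discriminator.
test_set = TensorDataset(torch.randn(256, 3, 32, 32), torch.zeros(256))
netD = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

@torch.no_grad()
def test_scores(train_mode, shuffle):
    netD.train() if train_mode else netD.eval()
    loader = DataLoader(test_set, batch_size=64, shuffle=shuffle)
    return torch.cat([netD(x).view(-1) for x, _ in loader])

demo_scores   = test_scores(train_mode=True,  shuffle=True)   # setting used in the released demo
strict_scores = test_scores(train_mode=False, shuffle=False)  # setting reported here as failing
```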