not enough RAM #4

Open
atonn opened this issue Jun 2, 2016 · 3 comments
Comments


atonn commented Jun 2, 2016

Hello, first of all thank you for uploading your training code!

Some issues I encountered:

  1. train.lua was missing lib/tv.lua and lib/tv2.lua. I used the tv.lua from your cnn-mrf repository and commented out the line for tv2.

  2. New users: don't forget to set flag_MDAN = true, flag_AG = true, and flag_MGAN = true in train.lua if you want to run the whole training process. Only the first step, flag_MDAN, is set to true by default.

  3. I had to reduce opt.batchSize in MGAN_wrapper.lua from 64 to 32 because my 4 GB GPU ran out of memory at this step.

  4. In order to successfully run release_MGAN.lua, I had to manually move the file netS_1.t7 from Dataset/VG_Alpilles_ImageNet100/MDAN to Dataset/VG_Alpilles_ImageNet100/MGAN.

  5. Finally, when running demo_MGAN.lua, I am stuck with:

`/home/atonn/torch/install/bin/luajit: /home/atonn/torch/install/share/lua/5.1/nn/Container.lua:67:
In 23 module of nn.Sequential:
/home/atonn/torch/install/share/lua/5.1/nn/THNN.lua:109: bad argument #2 to 'v' (input channels and nInputPlane dont match at /tmp/luarocks_cunn-scm-1-2642/cunn/lib/THCUNN/SpatialConvolutionMM.cu:29)
stack traceback:
[C]: in function 'v'
/home/atonn/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialConvolutionMM_updateOutput'
...id/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:111: in function <...id/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:107>
[C]: in function 'xpcall'
/home/atonn/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/atonn/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
demo_MGAN.lua:56: in main chunk
[C]: in function 'dofile'
...poid/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

WARNING: If you see a stack trace below, it doesn't point to the place where this error occured. Please use only the one above.
stack traceback:
[C]: in function 'error'
/home/atonn/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/home/atonn/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
demo_MGAN.lua:56: in main chunk
[C]: in function 'dofile'
...poid/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670`

I suppose that's what I get for messing with the batch size? When substituting my MGAN file with the provided pretrained file it runs just fine. Any help would be appreciated. Cheers :D
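
For anyone following along, the edits from items 2 and 3 amount to roughly the fragment below. It is only a sketch: the names and values are the ones quoted in this thread, but the exact declarations in train.lua and MGAN_wrapper.lua may look different (for example, they may be declared `local`), and the `opt = opt or {}` line is a placeholder added here so the fragment stands alone.

```lua
-- train.lua: enable all three training stages
-- (only flag_MDAN is true by default)
flag_MDAN = true
flag_AG   = true
flag_MGAN = true

-- MGAN_wrapper.lua: lower the batch size from 64 to 32 to fit a 4 GB GPU
opt = opt or {}      -- placeholder only; the real script already defines opt
opt.batchSize = 32
```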

@chuanli11
Owner

Hi atonn,

Thanks a lot for the feedback. I uploaded the missing tv files, which should solve all your problems.

Author

atonn commented Jun 3, 2016

Thank you! All of my problems were indeed solved. The results don't look as good as those from your provided models, but I guess that is due to the batchSize of 32?

[Image: photo_2016-06-03_20-37-12]


hankobe commented Jun 23, 2017

Hi @atonn,
How about now? Are the results you obtained good or not?
