Out of memory when testing the pruned model #19

Open

Tianxiaomo opened this issue May 15, 2018 · 4 comments

Comments

@Tianxiaomo commented May 15, 2018
I am using four Tesla K80s (12 GB each). While training during pruning everything was normal, but testing the pruned model runs out of memory.
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b418-xiwei/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 343, in <module>
    fine_tuner.prune()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 267, in prune
    self.test()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 187, in test
    output = model(Variable(batch))
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 68, in forward
    x = self.features(x)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 142, in forward
    self.return_indices)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/functional.py", line 360, in max_pool2d
    ret = torch._C._nn.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58

I use batch_size=16, so the batches are not large, yet it still runs out of memory.
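
One thing worth noting from the traceback: the test() frame runs output = model(Variable(batch)) with autograd enabled, so the forward pass keeps every intermediate activation alive for a backward pass that never happens. Below is a minimal sketch of an evaluation loop that avoids this; model and test_loader are illustrative names, not the repo's exact attributes.

import torch

def test(model, test_loader):
    # Illustrative test loop. The key point is torch.no_grad() (PyTorch >= 0.4;
    # on 0.3 wrap inputs as Variable(batch, volatile=True) instead): without it,
    # the forward pass stores activations for backprop and can exhaust GPU memory.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for batch, labels in test_loader:
            batch, labels = batch.cuda(), labels.cuda()
            output = model(batch)
            pred = output.max(1)[1]  # index of the highest logit per sample
            correct += (pred == labels).sum().item()
            total += labels.size(0)
    return correct / total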

@cc94226 commented Jun 2, 2018

What GPU are you using?

@CodePlay2016 commented
Have you solved this problem? I found that the two backward calls at finetune.py lines 172 and 174 each double the memory usage, taking mine from 3200 MB to 7000 MB and then to 11000 MB. The first increase happens while computing the pruning plan, so the gradients it produces are useless for fine-tuning, but I haven't found any way to clear that gradient cache.
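
One possible workaround, sketched here under the assumption that the ranking backward pass finishes before fine-tuning begins: drop the .grad buffers the ranking pass left behind and let the caching allocator reuse the blocks. release_ranking_gradients is a hypothetical helper, not part of finetune.py.

import torch

def release_ranking_gradients(model):
    # After the backward pass that only ranks filters, the accumulated
    # .grad tensors are dead weight; dropping the references lets the
    # allocator reuse that memory for the fine-tuning pass.
    for p in model.parameters():
        p.grad = None
    torch.cuda.empty_cache()  # return cached, unused blocks to the driver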

@weixia1 commented Jul 13, 2018

@Tianxiaomo Hi, can you tell me the command for testing the model? I can't find it. Thanks.

@buttercutter commented Oct 8, 2018

@CodePlay2016 I am facing an almost identical out-of-memory problem.

Could you comment on this? Do you have any actual, working countermeasure so far?

[phung@archlinux pytorch-pruning]$ python finetune.py --prune
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  warnings.warn("The use of the transforms.Scale transform is deprecated, " +
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:562: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  warnings.warn("The use of the transforms.RandomSizedCrop transform is deprecated, " +
Accuracy:  0.5848
Number of prunning iterations to reduce 67% filters 5
Ranking filters.. 
Traceback (most recent call last):
  File "finetune.py", line 270, in <module>
    fine_tuner.prune()
  File "finetune.py", line 217, in prune
    prune_targets = self.get_candidates_to_prune(num_filters_to_prune_per_iteration)
  File "finetune.py", line 184, in get_candidates_to_prune
    self.train_epoch(rank_filters = True)
  File "finetune.py", line 179, in train_epoch
    self.train_batch(optimizer, batch.cuda(), label.cuda(), rank_filters)
  File "finetune.py", line 172, in train_batch
    self.criterion(output, Variable(label)).backward()
  File "/usr/lib/python3.7/site-packages/torch/tensor.py", line 96, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory
[phung@archlinux pytorch-pruning]$ 
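
For what it's worth, this failure happens inside the backward call that ranks filters, so torch.no_grad() cannot help here. A sketch of per-batch cleanup for the ranking loop follows; the names (model, criterion, batch, label) mirror the traceback but are assumptions, not the repo's exact code.

import torch

def rank_batch(model, criterion, batch, label):
    # One ranking step: the backward pass fires whatever hooks score the
    # filters, after which the graph and outputs can be discarded.
    model.zero_grad()
    output = model(batch.cuda())
    loss = criterion(output, label.cuda())
    loss.backward()
    del output, loss          # drop references so autograd's graph is freed
    torch.cuda.empty_cache()  # optionally release cached blocks between batches

Reducing the batch size used for the ranking pass, independently of the fine-tuning batch size, is the other obvious lever.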
