Out of memory when testing the pruned model #19

Open

Tianxiaomo opened this issue May 15, 2018 · 4 comments

Comments

@Tianxiaomo commented May 15, 2018
I am using four Tesla K80s (12 GB each). While training during pruning everything was normal, but testing the pruned model runs out of memory.
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b418-xiwei/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 343, in <module>
    fine_tuner.prune()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 267, in prune
    self.test()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 187, in test
    output = model(Variable(batch))
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 68, in forward
    x = self.features(x)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 142, in forward
    self.return_indices)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/functional.py", line 360, in max_pool2d
    ret = torch._C._nn.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58

I use batch_size=16, so the batches are not large, yet it still runs out of memory.
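
One thing worth noting from the traceback: the test() frame runs output = model(Variable(batch)) with autograd enabled, so the forward pass keeps every intermediate activation alive for a backward pass that never happens. Below is a minimal sketch of an evaluation loop that avoids this; model and test_loader are illustrative names, not the repo's exact attributes.

import torch

def test(model, test_loader):
    # Illustrative test loop. The key point is torch.no_grad() (PyTorch >= 0.4;
    # on 0.3 wrap inputs as Variable(batch, volatile=True) instead): without it,
    # the forward pass stores activations for backprop and can exhaust GPU memory.
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for batch, labels in test_loader:
            batch, labels = batch.cuda(), labels.cuda()
            output = model(batch)
            pred = output.max(1)[1]  # index of the highest logit per sample
            correct += (pred == labels).sum().item()
            total += labels.size(0)
    return correct / total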

@cc94226 commented Jun 2, 2018

What GPU are you using?

@CodePlay2016 commented
Have you solved this problem? I found that the two backward calls at finetune.py lines 172 and 174 each double the memory usage, taking mine from 3200 MB to 7000 MB and then to 11000 MB. The first increase happens while computing the pruning plan, so the gradients it produces are useless for fine-tuning, but I haven't found any way to clear that gradient cache.
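
One possible workaround, sketched here under the assumption that the ranking backward pass finishes before fine-tuning begins: drop the .grad buffers the ranking pass left behind and let the caching allocator reuse the blocks. release_ranking_gradients is a hypothetical helper, not part of finetune.py.

import torch

def release_ranking_gradients(model):
    # After the backward pass that only ranks filters, the accumulated
    # .grad tensors are dead weight; dropping the references lets the
    # allocator reuse that memory for the fine-tuning pass.
    for p in model.parameters():
        p.grad = None
    torch.cuda.empty_cache()  # return cached, unused blocks to the driver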

@weixia1 commented Jul 13, 2018

@Tianxiaomo Hi, can you tell me the command for testing the model? I can't find it. Thanks.

@buttercutter commented Oct 8, 2018

@CodePlay2016 I am facing an almost identical out-of-memory problem.

Could you comment on this? Do you have any actual, working countermeasure so far?

[phung@archlinux pytorch-pruning]$ python finetune.py --prune
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  warnings.warn("The use of the transforms.Scale transform is deprecated, " +
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:562: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  warnings.warn("The use of the transforms.RandomSizedCrop transform is deprecated, " +
Accuracy:  0.5848
Number of prunning iterations to reduce 67% filters 5
Ranking filters.. 
Traceback (most recent call last):
  File "finetune.py", line 270, in <module>
    fine_tuner.prune()
  File "finetune.py", line 217, in prune
    prune_targets = self.get_candidates_to_prune(num_filters_to_prune_per_iteration)
  File "finetune.py", line 184, in get_candidates_to_prune
    self.train_epoch(rank_filters = True)
  File "finetune.py", line 179, in train_epoch
    self.train_batch(optimizer, batch.cuda(), label.cuda(), rank_filters)
  File "finetune.py", line 172, in train_batch
    self.criterion(output, Variable(label)).backward()
  File "/usr/lib/python3.7/site-packages/torch/tensor.py", line 96, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory
[phung@archlinux pytorch-pruning]$ 
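
For what it's worth, this failure happens inside the backward call that ranks filters, so torch.no_grad() cannot help here. A sketch of per-batch cleanup for the ranking loop follows; the names (model, criterion, batch, label) mirror the traceback but are assumptions, not the repo's exact code.

import torch

def rank_batch(model, criterion, batch, label):
    # One ranking step: the backward pass fires whatever hooks score the
    # filters, after which the graph and outputs can be discarded.
    model.zero_grad()
    output = model(batch.cuda())
    loss = criterion(output, label.cuda())
    loss.backward()
    del output, loss          # drop references so autograd's graph is freed
    torch.cuda.empty_cache()  # optionally release cached blocks between batches

Reducing the batch size used for the ranking pass, independently of the fine-tuning batch size, is the other obvious lever.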
