Added resume training args to run_training #11

Open
wants to merge 1 commit into
base: master

ashirviskas

A complete version of his PR: #6

mobiuscreek added a commit to mobiuscreek/coronahack-stylegan that referenced this pull request May 3, 2020
@ahmedshingaly

Is there any way to train on a custom dataset with a single RTX 2080 Ti GPU?
It always fails with a CUDA out-of-memory error before it reaches the first tick.
I am not sure how to reduce the batch size.
I am using 1024x1024 images.
Thank you in advance.

@ashirviskas
Author

@ahmedshingaly

Try lowering these numbers in the file directly:

https://github.com/NVlabs/stylegan2/blob/master/run_training.py#L54

I haven't tried it, but this may work.
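For context, here is a minimal sketch of how the two numbers might relate, assuming they are `sched.minibatch_size_base` (total minibatch across all GPUs) and `sched.minibatch_gpu_base` (minibatch processed per GPU at a time), the two settings quoted later in this thread. It also assumes the training loop accumulates gradients over `minibatch_size // (minibatch_gpu * num_gpus)` rounds. The helper `accumulation_rounds` is hypothetical and not part of the repository.

```python
def accumulation_rounds(minibatch_size: int, minibatch_gpu: int, num_gpus: int) -> int:
    """Hypothetical helper: gradient-accumulation rounds per optimizer step.

    Assumes the total minibatch is split into chunks of
    minibatch_gpu * num_gpus images that are processed one after another.
    """
    per_round = minibatch_gpu * num_gpus
    if per_round <= 0 or minibatch_size % per_round != 0:
        raise ValueError("minibatch_size must be a positive multiple of minibatch_gpu * num_gpus")
    return minibatch_size // per_round


# Total minibatch of 32 on a single GPU, 4 images at a time.
print(accumulation_rounds(32, 4, 1))  # 8
# Halving the per-GPU minibatch doubles the number of rounds.
print(accumulation_rounds(32, 2, 1))  # 16
```

Under these assumptions, the per-GPU value is the one that most directly affects peak GPU memory, so lowering it while keeping the total unchanged would trade speed for memory without changing the effective batch size.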

@ahmedshingaly

Thank you very much, I will try that.
Another question:
my model in https://github.com/NVlabs/stylegan produces a shape of (1, 12, 512),
but when I use the StyleGAN encoder (https://github.com/Puzer/stylegan-encoder) to find the latent space, it requires (1, 18, 512). Do you have any idea how I can produce a model with (1, 18, 512) shapes instead of (1, 12, 512)?

@ashirviskas
Author

This is a repository for StyleGAN2, and the encoder you linked is for the original StyleGAN; that is probably why they have different shapes.

For this purpose I was using a pytorch implementation from here: https://github.com/Tetratrio/stylegan2_pytorch

and just made it save the latent space into a numpy file using these two lines in run_projector.py, project_images function:
dlatent = proj.get_dlatent().cpu().numpy()
np.save(os.path.join(args.output, name_prefix[i + k] + 'final_dlatent.npy'), dlatent)

(Sorry if formatting is messed up, posting from mobile)
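
On the (1, 12, 512) versus (1, 18, 512) shapes discussed above, one more factor may be worth checking. In the official StyleGAN synthesis network the number of style layers (the middle dimension of the dlatent) is derived from the output resolution as 2 * log2(resolution) - 2. The sketch below only illustrates that relationship; `num_style_layers` is a hypothetical helper, not a function from either repository.

```python
import math


def num_style_layers(resolution: int) -> int:
    """Hypothetical helper: style layers for a power-of-two output resolution."""
    if resolution < 4:
        raise ValueError("resolution must be at least 4")
    log2_res = int(math.log2(resolution))
    if 2 ** log2_res != resolution:
        raise ValueError("resolution must be a power of two")
    return 2 * log2_res - 2


for res in (128, 256, 512, 1024):
    # Prints 12, 14, 16 and 18 layers respectively.
    print(res, num_style_layers(res))
```

By this formula a (1, 12, 512) dlatent would be consistent with a generator built for 128x128 output, while an encoder expecting (1, 18, 512) would be assuming a 1024x1024 generator, so the resolution the model was configured with may be the thing to verify.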

@ahmedshingaly

Dear @ashirviskas,
Thank you very much, I will try it now. I appreciate your effort.
Best regards,
Ahmed Khairadeen

@YukiSakuma

YukiSakuma commented May 30, 2020

Can anyone help? Since Google Colab removed the free RAM upgrade to 25GB, I am not able to train my model. I am training from a pretrained model that was trained on 512x512 images, but it cannot get past the first iteration: it always goes over the RAM limit (12.72GB) and crashes the notebook. I tried adjusting

sched.minibatch_size_base = 32
sched.minibatch_gpu_base = 4

without success, including setting both to a value of 2.

@ricshaw

ricshaw commented Jun 29, 2020

Does stylegan2 allow us to condition on the class label? And can I generate samples for a particular class?

@JanineCHEN

Does stylegan2 allow us to condition on the class label? And can I generate samples for a particular class?

I have the same question. Any insights would be greatly appreciated.

@vsemecky

Can anyone help? Since google colab has removed the free RAM upgrade to 25GB.

You have a few options:

  1. Use Colab Pro, which is not free, but the price is very low.
  2. Use Skyflynil's fork of StyleGAN2, which is not as memory-intensive as the original StyleGAN2: https://github.com/skyflynil/stylegan2

@jasuriy

jasuriy commented Jun 19, 2024

@ahmedshingaly hi,
did you succeed in training the model on a custom dataset?
I have a question, if you could help.
Thank you.

@jasuriy

jasuriy commented Jun 20, 2024

Hi @ashirviskas @ahmedshingaly @YukiSakuma @ricshaw @JanineCHEN,
As far as I can see, you all have some expertise relevant to my issue with training StyleGAN2 on a custom dataset. Could you please help with this issue:
Error: dataset root directory does not exist.
root@fvt:/workspace/stylegan2# python3 run_training.py --num-gpus=4 --data-dir=/workspace/stylegan2/datasets/my-custom-dataset --config=config-f --dataset=my-custom-dataset --mirror-augment=true
Local submit - run_dir: results/00002-stylegan2-my-custom-dataset-4gpu-config-f
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
Traceback (most recent call last):
  File "run_training.py", line 192, in <module>
    main()
  File "run_training.py", line 187, in main
    run(**vars(args))
  File "run_training.py", line 120, in run
    dnnlib.submit_run(**kwargs)
  File "/workspace/stylegan2/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/workspace/stylegan2/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/workspace/stylegan2/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/workspace/stylegan2/training/training_loop.py", line 141, in training_loop
    training_set = dataset.load_dataset(data_dir=dnnlib.convert_path(data_dir), verbose=True, **dataset_args)
  File "/workspace/stylegan2/training/dataset.py", line 192, in load_dataset
    dataset = dnnlib.util.get_obj_by_name(class_name)(**kwargs)
  File "/workspace/stylegan2/training/dataset.py", line 53, in __init__
    assert os.path.isdir(self.tfrecord_dir)
AssertionError

I am using these:
pre-trained FFHQ network: stylegan2-ffhq-config-f.pkl
custom dataset: ~5,000 PNG images, 1024x1024 (converted to TFRecords)
GPUs: 4x NVIDIA RTX A6000
running: python3 run_training.py --num-gpus=4 --data-dir=/workspace/stylegan2/datasets/ffhq --config=config-f --dataset=ffhq --mirror-augment=true

I would really appreciate it if you could help with this issue.
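
One possible cause of the `AssertionError` above, offered as a guess rather than a confirmed diagnosis: if the loader builds `tfrecord_dir` by joining `--data-dir` with `--dataset`, then passing the dataset folder itself as `--data-dir` makes it look for `.../my-custom-dataset/my-custom-dataset`. The sketch below reproduces that path logic with hypothetical helpers (`resolve_tfrecord_dir`, `dataset_dir_exists`) and a temporary directory standing in for `/workspace/stylegan2`.

```python
import os
import tempfile


def resolve_tfrecord_dir(data_dir: str, dataset: str) -> str:
    """Hypothetical helper: assumes the loader joins --data-dir and --dataset."""
    return os.path.join(data_dir, dataset)


def dataset_dir_exists(data_dir: str, dataset: str) -> bool:
    """Mirrors the os.path.isdir check that fails in the traceback."""
    return os.path.isdir(resolve_tfrecord_dir(data_dir, dataset))


def demo() -> None:
    with tempfile.TemporaryDirectory() as root:
        datasets_root = os.path.join(root, "datasets")
        os.makedirs(os.path.join(datasets_root, "my-custom-dataset"))

        # --data-dir pointing at the dataset folder itself: the name is appended twice.
        nested = os.path.join(datasets_root, "my-custom-dataset")
        print(dataset_dir_exists(nested, "my-custom-dataset"))  # False

        # --data-dir pointing at the parent folder that contains the dataset.
        print(dataset_dir_exists(datasets_root, "my-custom-dataset"))  # True


demo()
```

If that assumption holds for this repository, running with `--data-dir=/workspace/stylegan2/datasets --dataset=my-custom-dataset` would be the combination to try.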


7 participants