-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use general way to access and update submodules #371
Conversation
[ghstack-poisoned]
ghstack-source-id: 9372cca2b62078993fa7cd8f28d3b5bb710b406b Pull Request resolved: #371
[ghstack-poisoned]
ghstack-source-id: 6eff83d7fe5af576dc6da0dcdae5bc51b4ac8ec4 Pull Request resolved: #371
@@ -369,7 +369,7 @@ def parallelize_llama(model, world_mesh, parallel_dims, job_config: JobConfig): | |||
# apply AC + torch.compile | |||
ac_config = job_config.activation_checkpoint | |||
enable_compile = job_config.training.compile | |||
for layer_id, transformer_block in model.layers.items(): | |||
for layer_id, transformer_block in model.layers.named_children(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any ordering gaurantee of named_children though?
i think this will work only because the names are assumed to be '0', '1', ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, does order even matter here? i was thinking more about the model's forward code, but that code doesn't get affected by this.
This PR fixes the issue mentioned [here](pytorch/pytorch#126653 (comment)): "Module object has no attributed items." The reason is, a split `ModuleDict` is no longer a `ModuleDict`. (Future support is not guaranteed.) It would be more generally applicable if we use `named_children()` and `register_module()` to access and update submodules. [ghstack-poisoned]
ghstack-source-id: ba1d77e5825a26632fe9b7509a88b44509cac45f Pull Request resolved: #371
Stack from ghstack (oldest at bottom):
This PR fixes the issue mentioned here:
"Module object has no attributed items."
The reason is, a split
ModuleDict
is no longer aModuleDict
.It would be more generally applicable if we use
named_children()
andregister_module()
to access and update submodules.