
[TorchFX] INT8 Weights Compression Support #2891

Merged
92 commits merged into openvinotoolkit:develop on Sep 26, 2024

Conversation

anzr299
Contributor

@anzr299 anzr299 commented Aug 19, 2024

Changes

  1. Added the weights compression implementation, based on the template weights compression.
  2. Modified the graph builder for Torch FX to handle an edge case where an embedding node's weight was not placed on the correct port.
  3. Updated the Torch weights compression tests to include the FX embedding metatype, so that some Torch test functions can be reused in the FX tests.

Reason for changes

To support nncf.compress_weights() for Torch Fx models.
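For orientation, INT8 weights compression replaces each FP32 weight tensor with an int8 tensor plus per-channel scales, cutting weight storage roughly 4x. A minimal NumPy sketch of the arithmetic (illustrative only, not NNCF's implementation; the real entry point is nncf.compress_weights()):

```python
import numpy as np

def compress_weight_int8(w: np.ndarray):
    """Symmetric per-output-channel INT8 quantization: w ~= q * scale."""
    # One scale per output channel (row of w), mapping max |w| to 127.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def decompress(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)
q, scale = compress_weight_int8(w)
w_hat = decompress(q, scale)
print(q.dtype, q.nbytes, w.nbytes)  # int8 storage is a quarter of float32
```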

Tests

Added tests at tests/torch/fx/test_compress_weights.py. Reused the models and some tests from the Torch implementation, and added extra checks such as verifying that the compressed model is smaller than the original model.

Performance:

tinyllama-1.1b-step-50k-105b inference speeds:

  • Torch FX, compressed: 0.963 s
  • Torch FX, compiled with the OV backend: 0.074 s
  • Torch FX, compiled with the OV backend and compressed: 0.040 s
  • OV FP32: 0.079 s
  • OV INT8: 0.039 s

Constraints

Currently, only Torch FX representations extracted with torch._export.capture_pre_autograd_graph() are supported. #2987 tracks the request to support weights compression for FX models extracted using torch.export.export().

Tickets:

#2938

@anzr299 anzr299 requested a review from a team as a code owner August 19, 2024 17:25
@github-actions github-actions bot added the NNCF PT, experimental, and NNCF PTQ labels Aug 19, 2024
@daniil-lyakhov daniil-lyakhov self-assigned this Aug 19, 2024
@daniil-lyakhov daniil-lyakhov self-requested a review August 19, 2024 17:46
Copy link
Collaborator

@daniil-lyakhov daniil-lyakhov left a comment


Great work! My first row of comments:

nncf/experimental/torch/fx/transformations.py Outdated Show resolved Hide resolved
nncf/quantization/quantize_model.py Outdated Show resolved Hide resolved
nncf/quantization/quantize_model.py Outdated Show resolved Hide resolved
nncf/quantization/quantize_model.py Outdated Show resolved Hide resolved
@daniil-lyakhov daniil-lyakhov self-requested a review August 19, 2024 19:05
@daniil-lyakhov
Collaborator

@anzr299, please add the conformance test for the Tinyllama-1.1b model, as is done in this PR:
#2636

@anzr299
Contributor Author

anzr299 commented Aug 19, 2024

@anzr299, please add the conformance test for the Tinyllama-1.1b model, as is done in this PR: #2636

Alright

…g metatype

2. Modify the constant update transformation builder to accept an input port for the constant node. The default is set to 1
1. Inference is performed correctly with the compressed model
2. The compressed model has the same output shape as the normal model
3. The compressed model's output does not differ significantly from the normal model's
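Those checks can be sketched in miniature (a hypothetical toy, not the actual test code): running a single linear layer with original and with INT8-decompressed weights should give same-shaped, close outputs.

```python
import numpy as np

# Toy "model": one linear layer y = x @ w.T, with w INT8-compressed per channel.
rng = np.random.default_rng(1)
w = rng.standard_normal((8, 16)).astype(np.float32)
scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale  # decompressed weights

x = rng.standard_normal((4, 16)).astype(np.float32)
y_ref = x @ w.T       # original model output
y_cmp = x @ w_hat.T   # compressed model output

assert y_cmp.shape == y_ref.shape  # same output shape
rel_err = float(np.abs(y_cmp - y_ref).max() / np.abs(y_ref).max())
assert rel_err < 0.1               # output not very different
```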
Collaborator

@daniil-lyakhov daniil-lyakhov left a comment


Please fix the reference files structure

tests/torch/fx/test_models.py Outdated Show resolved Hide resolved
Contributor

@alexsu52 alexsu52 left a comment


Good stuff!

for source_node in model.graph.nodes:
    node_type, node_metatype = GraphConverter._get_node_type_and_metatype(source_node, model)
    node_metatype = GraphConverter._map_fx_unique_metatypes(source_node, node_metatype)
    is_shared_node = False
Contributor

Redundant line

Collaborator
Suggested change
is_shared_node = False

Contributor Author
Oh, Alright. Done!

def shared_constants_unification_transformation(model: torch.fx.GraphModule):
    """
    Checks fx graph for shared constants, disconnects and eliminates redundant
    shared constant while connecting singular shared constant.
Contributor

Please elaborate a little more which problem this function solves.
From the current description it's not clear why some nodes should be disconnected and eliminated.
Maybe it's worth mentioning the issue with solver and min_max algorithms that they don't use is_shared attribute and so on.

Contributor Author

Alright, I will elaborate the comment.
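The idea behind the transformation can be shown with a hypothetical toy graph (plain dicts standing in for FX nodes; names are illustrative, not NNCF's code): duplicated constants are detected by value, every consumer is rewired to one canonical constant, and the redundant copies are dropped.

```python
def unify_shared_constants(consumers, constants):
    """consumers: {node: const_id}; constants: {const_id: value (hashable)}."""
    canonical = {}  # value -> first const_id that carried it
    remap = {}
    for cid, value in constants.items():
        canonical.setdefault(value, cid)
        remap[cid] = canonical[value]
    # Rewire every consumer to the canonical constant for its value.
    new_consumers = {node: remap[cid] for node, cid in consumers.items()}
    # Keep only the canonical constants; redundant copies are eliminated.
    kept = {cid: v for cid, v in constants.items() if remap[cid] == cid}
    return new_consumers, kept

consumers = {"conv1": "c0", "conv2": "c1", "add": "c2"}
constants = {"c0": ("w", 1.0), "c1": ("w", 1.0), "c2": ("b", 0.5)}
new_consumers, kept = unify_shared_constants(consumers, constants)
print(new_consumers, sorted(kept))
# -> {'conv1': 'c0', 'conv2': 'c0', 'add': 'c2'} ['c0', 'c2']
```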

{"gptq": True},
{"awq": True},
{"scale_estimation": True},
{"lora_correction": True},
Contributor

subset_size and dataset are also not supported.
If #2978 is merged before this PR, there will also be a backup_precision parameter.

Contributor Author

@anzr299 anzr299 Sep 26, 2024

So in the case of backup_precision, I can leave it for now, depending on whether the PR is merged?

Contributor Author

Following @alexsu52's guidance, the subset_size parameter is ignored when the dataset is None in the WeightsCompression Algorithm. For this reason, I didn't add a check for subset_size, but I did include a check for dataset.
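The resulting validation logic can be sketched as follows (a hypothetical guard with illustrative names; NNCF's actual checks live in the quantize_model entry point and may differ):

```python
def validate_fx_compression_args(dataset=None, awq=False, gptq=False,
                                 scale_estimation=False, lora_correction=False):
    """Reject options not supported by the TorchFX weights-compression backend."""
    unsupported = {"awq": awq, "gptq": gptq,
                   "scale_estimation": scale_estimation,
                   "lora_correction": lora_correction}
    enabled = [name for name, on in unsupported.items() if on]
    if enabled:
        raise ValueError(f"TorchFX backend does not support: {enabled}")
    if dataset is not None:
        raise ValueError("TorchFX backend supports only data-free compression")
    # subset_size needs no explicit check: it is ignored when dataset is None.

validate_fx_compression_args()  # OK: data-free compression with defaults
try:
    validate_fx_compression_args(gptq=True)
except ValueError as e:
    print(e)  # reports the unsupported option
```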

Contributor

@ljaljushkin ljaljushkin left a comment


I have only minor comments, overall LGTM

@alexsu52 alexsu52 merged commit c93676d into openvinotoolkit:develop Sep 26, 2024
13 checks passed
@MaximProshin MaximProshin changed the title [TorchFX] Weights Compression Support [TorchFX] INT8 Weights Compression Support Oct 10, 2024
Labels
documentation, experimental, NNCF PT, NNCF PTQ
4 participants