Dl/conv layer attrs update #23


daniil-lyakhov
Owner

Changes

Reason for changes

Related tickets

Tests

@daniil-lyakhov daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 5437420 to bb403da Compare August 24, 2023 13:44
@daniil-lyakhov daniil-lyakhov force-pushed the dl/channel_alignment_improvements_full branch 3 times, most recently from 1eb303f to dd84e18 Compare August 25, 2023 11:19
@daniil-lyakhov daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch from bb403da to 9da200c Compare August 25, 2023 12:12
openvinotoolkit#2073)

### Changes

* ChannelAlignment algorithm is enabled by default
* Biases are added only for operations that are affected by CA algorithm

### Reason for changes

* To improve model metrics by enabling the ChannelAlignment algorithm by
default

### Related tickets

114328
114583

### Tests

tests/post_training/test_templates/test_channel_alignment.py is updated
@daniil-lyakhov daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 2 times, most recently from 300c01a to 669f0a9 Compare September 8, 2023 12:43
daniil-lyakhov and others added 18 commits September 8, 2023 14:47
Refactor smooth quant to use weights layout

Tests
### Changes

- Added new operation - GroupNormalization

### Reason for changes

- Performance degradations caused by an incorrect quantization
scheme
- New operation support

### Related tickets

- 119821
- 119335

### Tests

- TBD
### Changes

Disable MaskRCNN and RetinaNet graph tests until ticket 119664 is
resolved.

The following tests are now excluded:

test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-retinanet]

test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_t_a_sym_t-mask_rcnn]

test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-retinanet]

test_compressed_graph.py::TestModelsGraph::test_quantize_network[w_sym_ch_a_asym_t-mask_rcnn]

test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[retinanet]

test_compressed_graph.py::TestModelsGraph::test_magnitude_sparsity_network[mask_rcnn]

test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[retinanet]

test_compressed_graph.py::TestModelsGraph::test_rb_sparsity_network[mask_rcnn]

test_compressed_graph.py::TestModelsGraph::test_pruning_network[retinanet]

test_compressed_graph.py::test_quantize_outputs[w_sym_t_a_sym_t-retinanet]

test_compressed_graph.py::test_quantize_outputs[w_sym_ch_a_asym_t-retinanet]
…toolkit#2118)

### Changes

Add a message about the deprecation of the `export_to_onnx_standard_ops`
option in NNCFConfig

### Reason 

The recommended way to export to ONNX with QuantizeLinear-DequantizeLinear
node pairs is `nncf.strip(quantized_model)`.
openvinotoolkit#2115)

### Changes

NNCF should not quantize GRU ops with linear_before_reset set to true,
since oneDNN does not support it yet
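
Ignoring nodes by attribute can be pictured with the minimal sketch below. The `Node` dataclass and the `should_quantize` helper are hypothetical and are not NNCF classes or functions; only the `linear_before_reset` attribute name comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """Toy stand-in for a graph node (hypothetical, not an NNCF class)."""

    name: str
    op_type: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def should_quantize(node: Node) -> bool:
    """Return False for GRU nodes that have linear_before_reset enabled."""
    if node.op_type == "GRU" and node.attributes.get("linear_before_reset", False):
        return False
    return True


nodes = [
    Node("gru_a", "GRU", {"linear_before_reset": True}),
    Node("gru_b", "GRU", {"linear_before_reset": False}),
    Node("conv", "Convolution"),
]
# Only gru_a is excluded from quantization.
quantizable = [n.name for n in nodes if should_quantize(n)]
```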

### Reason for changes

To align with POT

### Related bug

openvinotoolkit#2105

### Tests

Added `test_ignore_nodes_by_attribues` for OV backend
…oolkit#2123)

### Changes

Added Whisper notebook to the list of quantization samples
### Changes

- Add marks `nightly` and `weakly` for tests.
- Mark sanity tests as `nightly`
- Split `test_functions.TestParametrized` into a fast part for precommit and
a long part for nightly
- Time of torch precommit reduced from 60 to 40 mins
- Set `xfail` for sanity tests with `--mode train` in case of a
segmentation fault.
A sporadic segmentation fault is reproduced on torch>=2.0.0 when calling
the `backward` function.

### Related tickets

119128
### Changes

Added the link to Quantization with accuracy control using NNCF
notebooks.

### Reason for changes

Customer adoption

### Related tickets

N/A

### Tests

N/A
### Changes

Fixed problem with shared weights in compression.

### Reason for changes

Problem with some LLMs with shared weights.

### Related tickets


### Tests
…t#2086)

### Changes

- Add support for the `dump_intermediate_model` parameter to save fully
quantized model in the AAQ pipeline

### Reason for changes

- Alignment with POT

### Related tickets

N/A

### Tests

N/A
### Changes

1. Fixed an issue with wrong `tqdm` bar length in the case when
calibration dataset length is less than `subset_size`.
Reproducer:
nikita-savelyevv@f0951c1
**Before:**
`Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 28.66it/s]`
**After:**
When dataset has `__len__`:
`Statistics collection: 100%|██████████████████| 101/101 [00:03<00:00,
28.20it/s]`
When dataset doesn't have `__len__`:
`Statistics collection: 34%|██████ | 101/300 [00:03<00:06, 29.45it/s]`

2. Improved progress bar GUI when run from notebooks.
**Before:**
<img width="704" alt="Screenshot 2023-09-06 091857"
src="https://github.com/openvinotoolkit/nncf/assets/23343961/9851cb8d-00f1-4297-af50-14697e86e961">

or (in some browsers progress bar takes up multiple lines):


![image](https://github.com/openvinotoolkit/nncf/assets/23343961/99fa9629-2869-4d8f-872e-97ef59bc092e)
**After:**
<img width="706" alt="Screenshot 2023-09-06 105453"
src="https://github.com/openvinotoolkit/nncf/assets/23343961/58e75cc9-2507-4c5b-8c3c-cac44eefcb79">

In console the progress bar is the same.
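
The bar-length rule from item 1 above can be sketched as follows. `progress_total` is a hypothetical helper name chosen for illustration, not the function used in NNCF; the sketch only assumes that the total is capped by the dataset length when that length is known.

```python
from typing import Iterable


def progress_total(dataset: Iterable, subset_size: int) -> int:
    """Pick the progress-bar length for statistics collection.

    If the dataset reports its length, the bar cannot be longer than the
    dataset itself; otherwise fall back to the requested subset size.
    """
    if hasattr(dataset, "__len__"):
        return min(len(dataset), subset_size)
    return subset_size


# A sized dataset with 101 items and subset_size=300 gives a 101-step bar.
sized_total = progress_total(list(range(101)), 300)
# A plain iterator has no __len__, so the bar keeps the subset_size length.
unsized_total = progress_total(iter(range(101)), 300)
```

The resulting value would then be passed as the `total` argument of the progress bar.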

### Reason for changes

User experience improvement.

### Related tickets

112627

### Tests

<!--- How was the correctness of changes tested and whether new tests
were added -->
### Changes

Upgrade ultralytics to 8.0.170

### Reason for changes

For some reason yolo samples started to fail. Upgrading ultralytics
solves this issue because the later version contains these changes:
ultralytics/ultralytics@a741961

### Related tickets

120311

### Tests

Build 82 passed
### Changes

Removal of upper bounds from `scipy` version.

### Reason for changes

- `scipy<1.11.1` has a security vulnerability (see ticket)
- The upper bound is causing pip conflicts in
openvinotoolkit/openvino#19458

### Related tickets

117438
### Changes

- Fixed behaviour in the `calibrate.py` for algos without options

### Reason for changes

- Bugfix

### Related tickets

- 120295

### Tests
…m properly by make command (openvinotoolkit#2127)

### Changes

All tests from `tests/experimental/{backend}/` are moved to directories
`tests/{backend}/experimental`

### Reason for changes

To enable these tests when the make command is called. These tests are not
run in precommit on the current develop branch
kshpv and others added 25 commits November 6, 2023 07:36
)

### Changes

Make StatisticsAggregator keep the original tensor shape after
aggregation.

### Reason for changes

To handle statistics correctly when batch_size > 1.

### Related tickets

121650

### Tests

All tests are updated accordingly
### Changes

Skip cuda test if cuda is not available

### Reason for changes

To fix CPU pre-commit

### Tests
precommit_torch_cpu/169/ is finished successfully
### Changes

<!--- What was changed (briefly), how to reproduce (if applicable), what
the reviewers should focus on -->

### Reason for changes

<!--- Why should the change be applied -->

### Related tickets

117723

### Tests

<!--- How was the correctness of changes tested and whether new tests
were added -->
### Changes
Extends `ModelInputInfo` mechanism used to specify inputs to
`NNCFNetwork` for graph building/exporting - now the input info can be
specified either as `FillerInputInfo`, which functions pretty much the
same as before and uses NNCF config file as the source of specification
for the input tensors, or as `ExactInputInfo`, which allows specifying
exact forward arguments for graph building. The latter is used to build
the model graph based on outputs of dataloaders attached to `NNCFConfig`
in the QAT API if the "input_info" field is not specified in
`NNCFConfig`, and also in the PTQ API flow to build the graph based on
the output of the calibration dataset.
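
The difference between the two kinds of input info can be illustrated with a toy sketch. The classes `ToyFillerInputInfo` and `ToyExactInputInfo`, the `get_forward_inputs` method and the `trace` helper are hypothetical names invented for this example; they only mirror the idea described above and are not the NNCF classes or their signatures.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


def zeros(shape: Tuple[int, ...]) -> Any:
    """Nested list of zeros with the given shape (stand-in for a filler tensor)."""
    if not shape:
        return 0.0
    return [zeros(shape[1:]) for _ in range(shape[0])]


@dataclass
class ToyFillerInputInfo:
    """Generates dummy forward arguments from declared shapes."""

    shapes: List[Tuple[int, ...]]

    def get_forward_inputs(self) -> Tuple[tuple, dict]:
        return tuple(zeros(s) for s in self.shapes), {}


@dataclass
class ToyExactInputInfo:
    """Stores the exact forward arguments, e.g. one dataloader output."""

    args: tuple = ()
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def get_forward_inputs(self) -> Tuple[tuple, dict]:
        return self.args, self.kwargs


def trace(forward: Callable[..., Any], input_info: Any) -> Any:
    """Run one forward pass with whatever inputs the info object provides."""
    args, kwargs = input_info.get_forward_inputs()
    return forward(*args, **kwargs)


def forward(x, scale=1.0):
    return [[v * scale for v in row] for row in x]


filler_out = trace(forward, ToyFillerInputInfo(shapes=[(1, 3)]))
exact_out = trace(forward, ToyExactInputInfo(args=([[1.0, 2.0]],), kwargs={"scale": 2.0}))
```

Both objects expose the same method, so the graph-building code does not need to know where the inputs came from.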

### Reason for changes
Previously the PTQ API had to specify its own `wrap_inputs_fn`,
`wrap_outputs_fn`, `dummy_forward_fn` to make NNCFNetwork build its
graph based on the outputs of the calibration dataloader - these
functions had to be mostly copy-pasted from the QAT approach to preserve
basic NNCF PT functionality such as traced tensor expiry, same tensor
replication etc. The new approach allows code reuse. Also the QAT use
cases where the init dataloaders are specified are made easier since
"input_info" fields in the NNCFConfig may now be omitted.

### Related tickets
N/A

### Tests

tests.torch.test_graph_building.test_input_info_args_are_passed_into_forward
tests.torch.test_graph_building.test_filler_input_info_arg_generation

tests.torch.test_graph_building.test_compressed_model_creation_can_build_exact_input_infos_from_dataloader_in_config

tests.torch.ptq.test_quantize_model_helpers.test_create_nncf_network_with_nncf_dataset
### Changes

- Updated SmoothQuant algorithm to work with Convolution layers;

### Reason for changes

- Better accuracy results in some cases;

### Related tickets

- 113591

### Tests

---------

Co-authored-by: Liubov Talamanova
### Changes

Added `Concat` to `MULTIHEAD_ATTENTION_OUTPUT` ignored pattern for OV,
ONNX, Torch backends

### Reason for changes

To improve accuracy of https://huggingface.co/EleutherAI/gpt-neo-1.3B
model

### Related tickets

* 117617
### Changes

- Added new files for 2023.2 scale references (only layer names were
changed) instead of the symlinks;
- Changed layer names for existing 2023.2 references;

### Reason for changes

- Alignment with the newest OV version
### Changes
As stated in the title

### Reason for changes
PTQ PT CUDA test cases fail

### Related tickets
124679

### Tests
test_input_infos_respect_device_setting
…t#2250)

### Changes
Fixed a regression introduced in openvinotoolkit#2196 for the object detection samples
and bumped the `datasets` version for the movement sparsity tests to fix
a `Loading a dataset cached in a LocalFileSystem is not supported` error
in the associated test cases.

### Reason for changes
Torch nightly tests fail otherwise.

### Related tickets
N/A

### Tests
torch_nightly
### Changes

Allow the use of an external weight importance information for
reordering weights of the super-network.

Adds missing info in experimental schema for previously committed KD. 

### Reason for changes

Several advanced algorithms can produce weight importance information
that outperform L1/L2 weight reordering strategies. This PR allows the
use of external weight importance information to reorder the weights in
the super-network.
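
As a rough illustration of reordering by external importance, the sketch below sorts the output channels of a weight matrix by a user-supplied score. The function name `reorder_by_importance` and the one-score-per-output-channel layout are assumptions made for this example; the description above does not specify how the importance information is passed in.

```python
from typing import List, Sequence, Tuple


def reorder_by_importance(
    weight_rows: Sequence[Sequence[float]], importance: Sequence[float]
) -> Tuple[List[Sequence[float]], List[int]]:
    """Sort output channels (rows) so the most important ones come first.

    Returns the reordered rows and the permutation that was applied, so the
    same permutation can be reused for dependent layers.
    """
    if len(weight_rows) != len(importance):
        raise ValueError("Expected one importance score per output channel")
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return [weight_rows[i] for i in order], order


rows = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
scores = [0.1, 0.9, 0.5]  # external scores instead of L1/L2 norms of the rows
reordered, permutation = reorder_by_importance(rows, scores)
```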

### Related tickets

N/A

### Tests

Tests have been included.

---------

Co-authored-by: Yuan Jinjie
…vinotoolkit#2246)

### Changes

- Do not filter constant nodes for torch backend in the inference graph
- Fix version in requirements.txt for examples of
post_training_quantization
- torch 2.1.0 cannot be used for ssd300_vgg16 (export to onnx fails with
`Unsupported: ONNX export of operator get_pool_ceil_padding`, and tracing
is not supported either)
    - Update metrics
- Make PTEngine convert inputs to the model's device to sync behavior with
`create_compress_model`
- Mobilenet_v2 example converts the PyTorch model to IR by tracing
(without onnx).
- nncf.quantize for PyTorch works with a copy of the target model
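
The last item above means the caller's model is left untouched. A minimal sketch of that behaviour, using a toy model and a hypothetical `toy_quantize` function rather than `nncf.quantize` itself:

```python
import copy
from dataclasses import dataclass
from typing import List


@dataclass
class ToyModel:
    """Toy stand-in for a PyTorch model (hypothetical)."""

    weights: List[float]


def toy_quantize(model: ToyModel) -> ToyModel:
    """Modify a deep copy so the caller's model stays unchanged."""
    quantized = copy.deepcopy(model)
    # Rounding is only a placeholder for a real quantization transform.
    quantized.weights = [float(round(w)) for w in quantized.weights]
    return quantized


original = ToyModel(weights=[0.4, 1.6])
result = toy_quantize(original)
```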

### Reason for changes

To make PTQ work properly with disconnected graphs (like in
[example](https://github.com/openvinotoolkit/nncf/blob/develop/examples/post_training_quantization/torch/ssd300_vgg16/main.py))

### Related tickets
124417

### Tests

test_examples build 128

---------

Co-authored-by: Alexander Dokuchaev
…inotoolkit#2220)

### Changes
As stated in the title

### Reason for changes
This doesn't seem obvious to some developers, so will state this in the style guide.

### Related tickets
N/A

### Tests
N/A
### Changes

Introduced `nncf.torch.wrap_model(model: torch.nn.Module, example_input:
Any) -> NNCFNetwork`

### Reason for changes

Making it easier to obtain `NNCFNetwork`.

### Related tickets

N/A

### Tests

test_wrap_model.py
### Changes
The networkx requirement was updated to allow 3.1, and the pyparsing
limitation was removed. Disallowed colon symbols `:` are now replaced
during reads and writes of .dot graphs.
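
The colon handling can be sketched as a reversible substitution applied on write and undone on read. The replacement character `^` and the helper names below are assumptions for illustration only; the description above does not state which character is used.

```python
RESERVED_CHAR = ":"
REPLACEMENT_CHAR = "^"  # assumption: any character that never occurs in node names


def escape_name(name: str) -> str:
    """Replace colons before a node name is written to a .dot file."""
    if REPLACEMENT_CHAR in name:
        raise ValueError(f"Name already contains {REPLACEMENT_CHAR!r}: {name}")
    return name.replace(RESERVED_CHAR, REPLACEMENT_CHAR)


def unescape_name(name: str) -> str:
    """Restore colons after a node name is read back from a .dot file."""
    return name.replace(REPLACEMENT_CHAR, RESERVED_CHAR)


escaped = escape_name("model/conv1:0")
restored = unescape_name(escaped)
```

Rejecting names that already contain the replacement character keeps the round trip lossless.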

### Reason for changes
OV is now at the networkx 3.1, and we should be aligned at least on the
major version for better DX.

### Related tickets
69520

### Tests
Existing graph-checking tests
…penvinotoolkit#2253)

### Changes

Supports multi-device model inference and wrapped forward functions

### Reason for changes

Support tracing "bigscience/bloomz-560m" model from HF

### Related tickets

N/A

### Tests

test_no_self_forward,  test_multidevice_model
### Changes

Use built-in `tmp_path` for temporary files to fix NAS tests on Windows

### Reason for changes

The PR (openvinotoolkit#2234) introduced a new test which fails on Windows with error:

`PermissionError: [Errno 13] Permission denied: 'C:\\Users\\SYS_K8~1\\AppData\\Local\\Temp\\tmpmf1i25nd'`
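
A minimal sketch of the `tmp_path` pattern follows. A common cause of this error on Windows is reopening a `NamedTemporaryFile` by name while it is still open; whether that was the exact cause here is an assumption. The test name and file name are hypothetical.

```python
import tempfile
from pathlib import Path


def test_roundtrip_with_tmp_path(tmp_path: Path) -> None:
    """pytest injects `tmp_path`, a per-test temporary directory.

    Each access opens and closes the file, so no handle is held open
    while the file is read back by name.
    """
    target = tmp_path / "state.txt"
    target.write_text("elasticity state")
    assert target.read_text() == "elasticity state"


# Outside pytest, the same function can be exercised with a plain temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    test_roundtrip_with_tmp_path(Path(tmp_dir))
```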


### Related tickets

124904

### Tests

NAS tests on Windows
### Changes
Allow torchvision 0.16 in the examples

### Reason for changes
Otherwise the installation of the requirements for the torch examples
tries to install torchvision 0.16, which pulls torch 2.0.1, which
differs from the BKC torch v2.1

### Related tickets
N/A

### Tests
torch_nightly, torch E2E
)

### Changes

Exclude nodes that have more than one reduction axis from weight
compression

### Reason for changes

There's only one model that has multiple reduction axes.
It's `chatglm` with one embedding layer having [8132,32,2] shape. It was
decided to not quantize this layer, since it would save just 6Mb in 4Gb
model in case of int8 quantization with risk to reduce accuracy, and it
can't be quantized group-wise.

The idea is to switch to multiple reduction axes when it is really
needed.
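
The exclusion rule can be sketched as below, assuming that the reduction axes are all weight axes other than the channel axes and that the embedding uses axis 0 as its channel axis. Both function names are hypothetical and are not part of NNCF.

```python
from typing import Sequence, Tuple


def get_reduction_axes(shape: Sequence[int], channel_axes: Sequence[int]) -> Tuple[int, ...]:
    """Every axis that is not a channel axis is treated as a reduction axis."""
    ndim = len(shape)
    channels = {axis % ndim for axis in channel_axes}  # normalise negative axes
    return tuple(axis for axis in range(ndim) if axis not in channels)


def is_compressible(shape: Sequence[int], channel_axes: Sequence[int]) -> bool:
    """Only weights with exactly one reduction axis are compressed."""
    return len(get_reduction_axes(shape, channel_axes)) == 1


# A regular 2D weight has a single reduction axis and is compressed.
linear_ok = is_compressible((4096, 4096), channel_axes=[0])
# The chatglm embedding shape from the text has two reduction axes and is skipped.
embedding_ok = is_compressible((8132, 32, 2), channel_axes=[0])
```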

### Related tickets

n/a

### Tests

Tested on 104 models from the share with IRs for LLM models. In all cases
except chatglm there's a single reduction axis.
### Changes

Remove logic to set device in `PTEngine`, to support multi-device model
openvinotoolkit#2253
### Changes

Renamed `name` to `node_name` in the warning

### Reason for changes

chatglm model support

### Related tickets

125045

### Tests

test_not_quantize_with_multiple_reduction_axes
@daniil-lyakhov daniil-lyakhov force-pushed the dl/conv_layer_attrs_update branch 3 times, most recently from 3263f53 to bdeb0c5 Compare November 15, 2023 12:39