Evaluation Code Produces Identical Results with Different Caching Methods #17

mohsenhariri · 2024-07-11T18:17:18Z

Title: Evaluation Code Produces Identical Results with Different Caching Methods

Description:

It seems the evaluation code leads to the same result with different caching methods. I used these models:

mistralai/Mistral-7B-v0.1
mistralai/Mistral-7B-Instruct-v0.2

with 3 different caching methods: --compress_method KCVT, --compress_method GEAR, and --compress_method KIVI_V2. In all cases, the result is:

Model	KIVI, accuracy
Mistral-7B-v0.1	0.4245640636846095
Mistral-7B-Instruct-v0.2	0.4761182714177407
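
For reference, here is a minimal sketch of the check behind this observation. It assumes only the numbers in the table above; the helper name `methods_indistinguishable` is made up for illustration and is not part of the repository. Bit-identical accuracies across all three methods suggest the flag may not be changing the computation, since different compression schemes would normally perturb at least a few predictions.

```python
# Accuracies reported in the table above; every method gave the same value per model.
METHODS = ["KCVT", "GEAR", "KIVI_V2"]
reported = {
    "Mistral-7B-v0.1": {m: 0.4245640636846095 for m in METHODS},
    "Mistral-7B-Instruct-v0.2": {m: 0.4761182714177407 for m in METHODS},
}


def methods_indistinguishable(per_method):
    """Return True if every method produced exactly the same accuracy (hypothetical helper)."""
    return len(set(per_method.values())) == 1


# Models for which the compression method made no measurable difference.
suspicious = [model for model, acc in reported.items() if methods_indistinguishable(acc)]
print(suspicious)
```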

Steps to Reproduce:

Run the evaluation script with the models mistralai/Mistral-7B-v0.1 and mistralai/Mistral-7B-Instruct-v0.2.
Use the following caching methods: --compress_method KCVT, --compress_method GEAR, and --compress_method KIVI_V2.
Observe the identical results in KIVI accuracy across all methods.
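
The steps above can be sketched as a small sweep over models and methods. This is only an illustration: the script name `evaluation_script.py` and the `--model` flag are placeholders I am assuming here, and only `--compress_method` and its three values are taken from this report. Substitute the repository's actual entry point and arguments.

```python
from itertools import product

MODELS = ["mistralai/Mistral-7B-v0.1", "mistralai/Mistral-7B-Instruct-v0.2"]
METHODS = ["KCVT", "GEAR", "KIVI_V2"]

# Hypothetical placeholder: replace with the real evaluation script.
EVAL_SCRIPT = "evaluation_script.py"


def build_commands(script=EVAL_SCRIPT):
    """Build one command line per (model, method) pair."""
    commands = []
    for model, method in product(MODELS, METHODS):
        # "--model" is an assumed flag name; "--compress_method" comes from this issue.
        commands.append(["python", script, "--model", model, "--compress_method", method])
    return commands


# Print the commands rather than running them, so the sweep can be inspected first.
for cmd in build_commands():
    print(" ".join(cmd))
```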

Expected Behavior:
Different caching methods should produce varying results in KIVI accuracy.

Additional Information:
I checked the input arguments and the evaluation script reads them correctly, so I am sure I had different setups.


HaoKang-Timmy · 2024-07-16T19:55:52Z

Let me check that.

HaoKang-Timmy · 2024-07-18T19:29:32Z

The current version of the code does not support Mistral yet. However, you can try it with Llama3 and Llama2.
Mistral support will be added soon.

CUHKSZzxy · 2024-07-25T08:53:51Z

> The current version of the code does not support Mistral yet. However, you can try it with Llama3 and Llama2. Mistral support will be added soon.

Does this mean the current version is not ready to reproduce the GEAR results on Mistral models, as reported in the paper draft? If that is not the case, could you provide some suggestions? I could not find the related shell scripts.

Thanks!

@HaoKang-Timmy
