
refactor benchmark backend code #540

Merged 2 commits into main on Dec 5, 2022
Conversation

qjiang002
Collaborator

This PR aims to fix issue #533

Previously, benchmark_db_utils needed to load SystemInfo, which depends on the SDK version. However, since the current benchmark only needs system properties such as overall results for the benchmark plots and tables, we can refactor the code to avoid computing and storing additional SystemInfo in the cache, which would expire on every SDK upgrade.
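A minimal sketch of the idea behind the refactor: read only the lightweight system-level fields the benchmark plots/tables need, instead of deserializing a full SystemInfo object whose cached form is tied to an SDK version. The function name, document shape, and field names below are illustrative assumptions, not the actual explainaboard_web API.

```python
def overall_results(system_doc: dict) -> dict:
    """Extract metric-name -> value pairs from a raw system document,
    without constructing an SDK-versioned SystemInfo object.

    `system_doc` is a hypothetical raw database record; the nested
    "results"/"overall" layout is an assumption for illustration.
    """
    return {
        metric["metric_name"]: metric["value"]
        for metric in system_doc.get("results", {}).get("overall", [])
    }


# Example: only the overall metrics survive; nothing SDK-versioned is cached.
doc = {"results": {"overall": [{"metric_name": "F1", "value": 0.92}]}}
print(overall_results(doc))
```

Because the extracted dict contains only plain metric values, it stays valid across SDK upgrades, which is what removes the cache-expiry problem described above.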

Contributor

@neubig neubig left a comment


Looks great, thanks!

Could you please check if this page works after this fix has been applied? https://dev.explainaboard.inspiredco.ai/benchmark?id=globalbench_ner

@qjiang002
Collaborator Author

qjiang002 commented Dec 2, 2022

Hi @neubig , I got this error when loading this benchmark:

[0]   File "/Users/jiangqi/Desktop/Capstone/explainaboard_web/backend/src/gen/explainaboard_web/impl/db_utils/benchmark_db_utils.py", line 187, in <setcomp>
[0]     (x.dataset.dataset_name, x.dataset.sub_dataset, x.dataset.split)
[0] AttributeError: 'NoneType' object has no attribute 'dataset_name'

This is because this benchmark tries to find all NER systems with 'system_query': {'task_name': 'named-entity-recognition'}, but some systems have undefined/custom datasets, so their dataset is None.

One way to deal with undefined datasets is to ignore them in the benchmark. I don't think we can merge systems with undefined datasets because they may be for different tasks and have different metrics. WDYT?
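A sketch of that fix, assuming a simplified data model: filter out systems whose dataset is None before building the (dataset_name, sub_dataset, split) set, so the set comprehension at benchmark_db_utils.py:187 no longer hits the AttributeError. The Dataset/System classes and the helper name here are stand-ins, not the real explainaboard_web types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    dataset_name: str
    sub_dataset: Optional[str]
    split: str


@dataclass
class System:
    # None models a system uploaded with an undefined/custom dataset
    dataset: Optional[Dataset]


def dataset_keys(systems: list) -> set:
    """Collect dataset identifiers, skipping systems with no dataset
    so the comprehension never dereferences a None dataset."""
    return {
        (s.dataset.dataset_name, s.dataset.sub_dataset, s.dataset.split)
        for s in systems
        if s.dataset is not None  # ignore undefined/custom datasets
    }
```

With this guard, a query that matches both dataset-backed and custom-dataset systems simply drops the latter from the benchmark aggregation instead of crashing.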

Member

@lyuyangh lyuyangh left a comment


Thanks! I am not familiar with this code but the changes look good! I just noticed one small thing regarding the type annotation.

):
sys_info = unwrap(sys_info_tmp)
sys = unwrap(system_tmp)
Member


According to the type annotation, it seems that this unwrap is not necessary.
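For context, a typical unwrap helper only narrows Optional[T] to T by asserting the value is not None; the definition below is an assumed sketch, not the project's actual implementation. If sys_info_tmp is already annotated as a non-Optional type, the call adds a runtime check with no typing benefit, which is the reviewer's point.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def unwrap(value: Optional[T]) -> T:
    """Assert a value is not None and narrow its type from Optional[T] to T."""
    if value is None:
        raise ValueError("unexpected None")
    return value
```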

Collaborator Author


Thanks!

@qjiang002
Collaborator Author


This is another issue not related to this refactor PR. I'll merge this PR and record this problem in another issue.

@qjiang002 qjiang002 merged commit 73a4a89 into main Dec 5, 2022
@qjiang002 qjiang002 deleted the refactor-benchmark-backend branch December 5, 2022 05:02