style: format

antvis · Nov 22, 2024 · 3acac4b
1 parent c322b56
commit 3acac4b
Showing 32 changed files with 9,233 additions and 10,546 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -12,6 +12,9 @@ on:
     paths:
       - 'src/**'
       - '__tests__/**'
+      - 'evaluations/**'
+      - 'knowledges/**'
+      - 'docs/**'
 
 permissions:
   contents: read # to fetch code (actions/checkout)

diff --git a/.prettierignore b/.prettierignore
@@ -18,7 +18,5 @@ dist/**
 
 # evaluations
 evaluations/datastes/recommend/gpt_vis_train.json
-evaluations/datastes/recommend/evalResult.json
-evaluations/datastes/recommend/eval-result.json
-evaluations/datastes/recommend/eval_result.json
+evaluations/datastes/recommend/metrics.json
 
diff --git a/README.md b/README.md
@@ -138,6 +138,10 @@ The purpose of the [Visual Knowledge Base](https://github.com/antvis/GPT-Vis/tre
 
 Note: The numbers in the format of X/Y represent the metrics of the respective chart types when evaluated against the dataset.
 
+## 🤖 Chart Recommendation Dataset
+
+The chart recommendation dataset is designed to evaluate or fine-tune large language models on their ability to recommend chart types based on given data. The dataset currently encompasses 16 types of charts, with 1-3 different data scenarios per chart type, and more than 15 chart data instances for each scenario. The dataset is continuously updated, and we welcome contributions of chart data collected from your own use cases. For more detailed information about the dataset, please visit [evaluations/recommend](https://github.com/antvis/GPT-Vis/tree/main/evaluations/recommend/README.md).
+
 ## 💻 Development
 
 ```bash
@@ -151,9 +155,6 @@ $ pnpm dev
 $ pnpm build
 ```
 
-## 🤖 Chart Recommendation Dataset
-The chart recommendation dataset is designed to evaluate or fine-tune large language models on their ability to recommend chart types based on given data. The dataset currently encompasses 16 types of charts, with 1-3 different data scenarios per chart type, and more than 15 chart data instances for each scenario. The dataset is continuously updated, and we welcome contributions of chart data collected from your own use cases. For more detailed information about the dataset, please visit [evaluations/recommend](https://github.com/antvis/GPT-Vis/tree/main/evaluations/recommend/README.md).
-
 ## License
 
 [MIT](./LICENSE)
diff --git a/README.zh-CN.md b/README.zh-CN.md
@@ -137,6 +137,7 @@ set_gpt_vis(content)
 |               |                         |                      |               |                      |                 |         |
 
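 ## 🤖 Chart Recommendation Dataset
+
 The chart recommendation dataset is used to evaluate or fine-tune large models on the task of recommending a chart type for given data. The dataset currently covers 16 chart types, with 1-3 data scenarios per chart type and 15+ chart data entries per scenario. The data is continuously updated, and contributions of chart data collected from your own use cases are welcome. For details about the dataset, see [evaluations/recommend](https://github.com/antvis/GPT-Vis/tree/main/evaluations/recommend/README.md).
 
 ## 💻 Local Development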

diff --git a/evaluations/datastes/recommend/README.en.md b/evaluations/datastes/recommend/README.en.md
@@ -1,10 +1,13 @@
 # Chart Recommendation Dataset
+
 This folder provides datasets for evaluating/fine-tuning large models on the "Chart Recommendation" task. Here, the chart recommendation task refers to recommending suitable charts to display data based on the given data.
 
 Currently, the dataset covers 16 types of charts, with 1-3 distinct data scenarios for each chart type. Each scenario contains 15+ sets of chart data. The data will be continually updated, and we welcome contributions of chart data collected from your use scenarios.
 
 ## Dataset Description
+
 ### Original Chart Data
+
 Data is organized into different folders based on chart types, and chart data is divided into different files based on data scenarios. For example, the column chart folder column contains two files: 01_base.json for basic column chart data, and 02_split.json for split column chart data.
 
 In each data entry, source represents the user input. source.data contains the original data, and source.meta includes metadata about the input data, mainly field names and data types. One of these pieces of information can be missing. target represents the recommended chart type and the field mappings in the chart configuration. Here is an example:
@@ -30,12 +33,15 @@ In each data entry, source represents the user input. source.data contains the o
 ```
 
 ### Model Fine-Tuning Dataset
+
 The gpt_vis_train.jsonl file is a fine-tuning training dataset generated from the above original chart data. The generation strategy is as follows: randomly select half of the cases for each chart type (the remaining data is used for evaluation). Since the number of original data entries varies for each chart type, to avoid imbalanced chart quantities affecting recommendation results, some chart data entries are repeated a certain number of times to ensure there are 60 entries for each chart type in the training set.
 
 ### Evaluation Result File
+
 The `metrics.json` file contains the results of our model evaluation after fine-tuning. In this file, every source entry is the original input, target is the expected output, and generation is the model's output. Comparing these entries allows the evaluation of recommendation accuracy.
 
 ## Model's Performance on Chart Recommendation Task
+
 Using the above datasets, we achieved a chart type accuracy of 89% and an encode accuracy of 82% with fine-tuning based on the `qwen2.5-14b-instruct`.
 
 It is important to note that the model recommendations can satisfy the requirement of "providing data and returning chart and configuration" in most scenarios. However, the model's output is not entirely controllable, which may result in invalid output or charts that cannot be successfully rendered. We recommend combining model recommendations with the recommendation modules in [@antv/ava](https://ava.antv.antgroup.com/api/advice/advisor). In scenarios where the model performance is suboptimal or where traditional rules fulfill the recommendation requirements, rule-based recommendation pipelines can be used as a fallback.
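
The generation strategy described above for `gpt_vis_train.jsonl` (half of each chart type's cases go to training, the rest to evaluation, and training cases are repeated until every chart type has 60 entries) could be sketched roughly as follows. This is an illustration only, not the repository's generation script: `shuffle`, `buildTrainingSet`, the `casesByType` input shape, rounding the half up, and cycling through cases to reach 60 are all assumptions.

```javascript
// Hypothetical sketch; not the repository's actual generation script.
// Assumed input shape: { [chartType]: Array<{ source, target }> }.

// Fisher-Yates shuffle on a copy, so the input array is left untouched.
function shuffle(items, random = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Split each chart type in half, then balance the training half to `perType`.
function buildTrainingSet(casesByType, perType = 60, random = Math.random) {
  const train = [];
  const evaluation = [];
  for (const cases of Object.values(casesByType)) {
    const shuffled = shuffle(cases, random);
    // Assumption: odd counts round up, so training gets the extra case.
    const half = Math.ceil(shuffled.length / 2);
    const trainCases = shuffled.slice(0, half);
    evaluation.push(...shuffled.slice(half));
    if (trainCases.length === 0) continue; // nothing to repeat for this type
    // Cycle through the selected cases until this type has `perType` entries.
    // Assumption: a type with more than `perType` cases is truncated.
    for (let i = 0; i < perType; i++) {
      train.push(trainCases[i % trainCases.length]);
    }
  }
  return { train, evaluation };
}
```

Under these assumptions, a chart type with 5 original cases would contribute 3 distinct training cases (repeated to 60 entries) and 2 evaluation cases.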
diff --git a/evaluations/datastes/recommend/README.md b/evaluations/datastes/recommend/README.md
@@ -7,36 +7,48 @@
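 ## Dataset Description
 
 ### Original Chart Data
+
 Data is organized into folders by chart type, and within each folder the chart data is split into files by data scenario. For example, the column chart folder `column` contains two files: `01_base.json` for basic column chart data and `02_split.json` for split column chart data.
 
 In each entry, `source` is the user input: `source.data` is the raw input data, and `source.meta` is metadata about the input data, mainly field names and field data types. Either of these two pieces of information may be omitted when requesting a recommendation. `target` is the recommended chart type and the field mappings in the chart configuration. Example: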
+
 ```json
 {
-    "source": {
-        // field metadata
-        "metas": [{"name": "city", "dataType": "string"},{"name": "population","dataType": "number"}],
-        // raw data
-        "data": [{"city": "Beijing","population": 2154},{"city": "Shanghai","population": 2424},{"city": "Guangzhou","population": 1530}]
-    },
-    "target": [
-        {
-            "type": "column",
-            "encode": {
-                "x": ["city"], // x-axis field
-                "y": ["population"] // y-axis field
-            }
-        }
+  "source": {
+    // field metadata
+    "metas": [
+      { "name": "city", "dataType": "string" },
+      { "name": "population", "dataType": "number" }
+    ],
+    // raw data
+    "data": [
+      { "city": "Beijing", "population": 2154 },
+      { "city": "Shanghai", "population": 2424 },
+      { "city": "Guangzhou", "population": 1530 }
     ]
+  },
+  "target": [
+    {
+      "type": "column",
+      "encode": {
+        "x": ["city"], // x-axis field
+        "y": ["population"] // y-axis field
+      }
+    }
+  ]
 }
 ```
 
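 ### Model Fine-Tuning Dataset
+
 The `gpt_vis_train.jsonl` file is a fine-tuning training dataset that we generated from the original chart data above. The generation strategy is as follows: half of the cases for each chart type are randomly selected (the remaining data is used for evaluation). Because the number of original entries differs across chart types, some chart data entries are repeated a certain number of times so that each chart type has 60 entries in the training set, which avoids imbalanced chart counts affecting recommendation results.
 
 ### Evaluation Result File
+
 The `metrics.json` file contains the results of evaluating the model after fine-tuning. In each entry, `source` is the original input, `target` is the expected output, `generation` is the model's output, `correctness` indicates whether the chart type was recommended correctly, and `encodeScore` is the score of the recommended chart configuration. For how the evaluation metrics are computed, see `eval/eval-recommend.js`.
 
 ## Model's Performance on Chart Recommendation Task
+
 Using the above datasets, fine-tuning based on `qwen2.5-14b-instruct` achieves a chart type accuracy of up to 89% and an `encode` accuracy of 82%.
 
 Note that model recommendations meet the need of "given data, return a chart and its configuration" in most scenarios, but the model's output is not fully controllable: it may produce invalid output or charts that cannot be rendered. We recommend using it together with the recommendation module in [@antv/ava](https://ava.antv.antgroup.com/api/advice/advisor). Where the model performs poorly, or where traditional rules already satisfy the recommendation requirements, a rule-based recommendation pipeline can serve as a fallback.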
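
As a rough illustration of how the per-entry `correctness` and `encodeScore` fields in `metrics.json` could be rolled up into overall accuracy figures, here is a minimal sketch. It is not the logic of `eval/eval-recommend.js`, which is not shown here: the `summarizeMetrics` helper is hypothetical, and it assumes `correctness` is a boolean and `encodeScore` is a number between 0 and 1.

```javascript
// Hypothetical sketch; the repository's real metric computation may differ.
// Assumed entry shape: { source, target, generation, correctness, encodeScore },
// with `correctness` a boolean and `encodeScore` a number in [0, 1].
function summarizeMetrics(entries) {
  const count = entries.length;
  if (count === 0) {
    // No entries: accuracies are undefined, so report NaN instead of 0.
    return { count, typeAccuracy: NaN, encodeAccuracy: NaN };
  }
  let correct = 0;
  let encodeTotal = 0;
  for (const entry of entries) {
    if (entry.correctness === true) correct += 1;
    // Missing or non-numeric scores count as 0 (an assumption of this sketch).
    const score = Number(entry.encodeScore);
    encodeTotal += Number.isFinite(score) ? score : 0;
  }
  return {
    count,
    typeAccuracy: correct / count, // share of entries with the right chart type
    encodeAccuracy: encodeTotal / count, // mean encode score over all entries
  };
}
```

To apply it to a results file, one could parse the file with `JSON.parse(fs.readFileSync(path, "utf8"))` and pass the resulting array to `summarizeMetrics`, assuming the file holds a JSON array of entries.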