CLOMO: Counterfactual Logical Modification with Large Language Models

Official repository for the ACL 2024 paper "CLOMO: Counterfactual Logical Modification with Large Language Models".

For more details, please refer to the project page: https://clomo-logic.github.io/.

[Webpage] [Paper] [Dataset] [Examples]

About CLOMO

In the Counterfactual Logical Modification (CLOMO) task, a model is given an Argument and a Premise 1 that stand in a logical relation R, together with an additional Premise 2 that perturbs R. The model must modify the Argument into an Argument' such that R holds for the Argument'-Premise 2 pair.
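
As a rough sketch, the task can be viewed as the following input/output structure (the field names below are purely illustrative and are not the dataset's actual schema):

# Illustrative CLOMO instance; field names are hypothetical, not the dataset schema.
example = {
    "relation": "Weaken",  # the target logical relation R
    "premise_1": "...",    # Premise 1, which stands in relation R to the Argument
    "argument": "...",     # the original Argument
    "premise_2": "...",    # Premise 2, which perturbs relation R
}
# Expected output: a modified Argument' such that R holds for (Argument', Premise 2).
argument_prime = "..."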


We thus introduce the CLOMO dataset, which contains 1,000 high-quality and challenging questions covering four logical relations. The data were collected through multi-turn human annotation and verification.

Four categories of logical relations in CLOMO.


Additionally, we introduce a Self-Evaluation Score (SES) for logically consistent generation in CLOMO. SES decomposes the evaluation into several basic discrimination tasks for LLMs, and is shown to be comparable with human evaluation.

$$s = c(r|p_1, a) \times c(r|p_2, a') - c(r|p_2, a) \times c(r|p_2, a')$$

Decomposed SES evaluation tasks.
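
As a minimal sketch, taking the formula above at face value, the score can be computed from the discriminator confidences c(.) as follows (this is only an illustration of the formula, not the SES/ses.py implementation; the confidence values are placeholders):

def ses_score(c_r_p1_a, c_r_p2_a_prime, c_r_p2_a):
    # c_r_p1_a:       confidence that r holds for (Premise 1, Argument)
    # c_r_p2_a_prime: confidence that r holds for (Premise 2, Argument')
    # c_r_p2_a:       confidence that r holds for (Premise 2, Argument)
    return c_r_p1_a * c_r_p2_a_prime - c_r_p2_a * c_r_p2_a_prime

# Example: ses_score(0.9, 0.8, 0.2) -> 0.56  (i.e. 0.9*0.8 - 0.2*0.8)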

Dataset

The full CLOMO dataset can be downloaded from data/. We release the CLOMO data in three prompting setups:

| Setup     | Train           | Dev           | Test           |
|-----------|-----------------|---------------|----------------|
| Plain CoT | cot_train.json  | cot_dev.json  | cot_test.json  |
| Few-shot  | few_train.json  | few_dev.json  | few_test.json  |
| Zero-shot | zero_train.json | zero_dev.json | zero_test.json |
| #Sample   | 600             | 200           | 200            |

We also release the exclusive and picked-out subsets used for the ablation study on unseen logical relations (Table 6 in the paper).

  • The exclusive subsets w/o R: data/type_excluded/*_xR_train.json. Each subset contains 373 samples.

  • The picked-out subsets of R: data/type_picked/*_oR_test.json. The size of each subset is given in the table below.

| Setup     | Subset      | R=NA                | R=SA                | R=S                | R=W                |
|-----------|-------------|---------------------|---------------------|--------------------|--------------------|
| Plain CoT | train w/o R | cot_xna_train.json  | cot_xsa_train.json  | cot_xs_train.json  | cot_xw_train.json  |
| Plain CoT | test on R   | cot_ona_test.json   | cot_osa_test.json   | cot_os_test.json   | cot_ow_test.json   |
| Plain CoT | #test on R  | 79                  | 14                  | 35                 | 72                 |
| Few-shot  | train w/o R | few_xna_train.json  | few_xsa_train.json  | few_xs_train.json  | few_xw_train.json  |
| Few-shot  | test on R   | few_ona_test.json   | few_osa_test.json   | few_os_test.json   | few_ow_test.json   |
| Few-shot  | #test on R  | 79                  | 14                  | 35                 | 72                 |
| Zero-shot | train w/o R | zero_xna_train.json | zero_xsa_train.json | zero_xs_train.json | zero_xw_train.json |
| Zero-shot | test on R   | zero_ona_test.json  | zero_osa_test.json  | zero_os_test.json  | zero_ow_test.json  |
| Zero-shot | #test on R  | 79                  | 14                  | 35                 | 72                 |

Remark: NA = Necessary Assumption; SA = Sufficient Assumption; S = Strengthen; W = Weaken.
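
To take a quick look at a split, the JSON files can be loaded directly. The snippet below only assumes the split files sit under data/ and are standard JSON; it does not rely on any particular field layout:

import json

# Load one split (path assumes the file sits directly under data/).
with open("data/cot_train.json", "r", encoding="utf-8") as f:
    data = json.load(f)

print(len(data))  # number of samples if the file holds a list of instances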

Inference

Output Format

Please refer to the following template to prepare your result JSON file for SES evaluation: template_pred.json.

Run Inference

First, make sure you have installed all requirements:

pip install -r requirements.txt

For inference with an API key, run:

cd inference_only
python LLMer_inference.py --call_type api \
--model_name MODEL_NAME --api_key API_KEY \
--data_path PATH_TO_TEST_JSON_FILE \
--save_path PATH_TO_SAVE_RESULTS

For inference with a local LLM, run:

cd inference_only
CUDA_VISIBLE_DEVICES=0 python LLMer_inference.py --call_type llm \
--model_name MODEL_NAME --local_dir LOCAL_DIR \
--data_path PATH_TO_TEST_JSON_FILE \
--save_path PATH_TO_SAVE_RESULTS

Additionally, a sample script for the experiments on small LMs (Table 7 in the paper) is provided: LLMer_inference.sh.

Fine-tuning with CLOMO

Coming soon.

Evaluation with SES Score

First, prepare your result JSON file in the same format as template_pred.json.

Then, run:

python SES/ses.py \
--model_pred_file PATH_TO_PRED_JSON_FILE \
--api_model LLM_NAME \
--api_key API_KEY \
--api_org API_ORG

Cite

If you find CLOMO useful, please kindly cite:

@inproceedings{huang2023clomo,
 author = {Huang, Yinya and Hong, Ruixin and Zhang, Hongming and Shao, Wei and Yang, Zhicheng and Yu, Dong and Zhang, Changshui and Liang, Xiaodan and Song, Linqi},
 title = {{CLOMO}: Counterfactual Logical Modification with Large Language Models},
 booktitle = {The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)},
 year = {2024}
}
