PANDA

This repo contains the code for our work 🐼PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs.

The required packages can be installed by running the following command.

pip install -r requirements.txt

ScienceWorld

First, switch to the ScienceWorld workspace:

cd ScienceWorld

To run experiments with PANDA directly, use the scripts below. You can also run PANDA from scratch by starting from Step1.

./run_eval_react_panda.sh
./run_eval_reflexion_panda.sh
./run_eval_saycan_panda.sh

Step1: Gather expert trials to construct the preference data.

./gather_trials.sh

Step2: PANDA-Learning from the expert preferences.

./panda_learning.sh

Step3: Test with PANDA-Insight:

./run_eval_react_panda.sh
./run_eval_reflexion_panda.sh
./run_eval_saycan_panda.sh
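
The steps above can be chained so that a failure in one step stops the sequence before the next one starts. A minimal sketch, assuming the scripts are Bash scripts in the current directory; `run_steps` is a hypothetical helper and is not part of this repo:

```shell
#!/usr/bin/env bash
# Sketch: run the given scripts in order and stop at the first failure.
# run_steps is a hypothetical helper, not part of the repository.
run_steps() {
  local script
  for script in "$@"; do
    # Refuse to continue if a script is missing.
    if [ ! -f "$script" ]; then
      echo "missing script: $script" >&2
      return 1
    fi
    echo "==> $script"
    # Assumption: the scripts can be run with bash.
    bash "$script" || { echo "failed: $script" >&2; return 1; }
  done
}

# Usage, from inside the ScienceWorld directory:
# run_steps ./gather_trials.sh ./panda_learning.sh ./run_eval_react_panda.sh
```

Invoking the scripts through `bash` avoids depending on the executable bit being set; if you prefer the `./script.sh` form used above, make sure the files are executable.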

TweetEval

First, switch to the TweetEval workspace:

cd TweetEval

Step0: Download the dataset files from cardiffnlp/tweeteval and put them in the dataset folder, then download the expert models from cardiffnlp/models and put them in the models folder.

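
Before moving on, it can help to confirm that both folders from Step0 exist and are not empty. A minimal sketch; `check_nonempty_dir` is a hypothetical helper that is not part of this repo, and the folder names `dataset` and `models` are taken from the step above:

```shell
#!/usr/bin/env bash
# Sketch: confirm that a directory exists and contains at least one entry.
# check_nonempty_dir is a hypothetical helper, not part of the repository.
check_nonempty_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # ls -A lists every entry except "." and "..", including hidden files.
  if [ -z "$(ls -A "$dir")" ]; then
    echo "empty directory: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Usage, from inside the TweetEval directory:
# check_nonempty_dir dataset && check_nonempty_dir models
```

This only checks that something was downloaded; it does not verify that the files are the ones the later steps expect.
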
Step1: Gather expert trials to construct the preference data.

./gather_trials.sh

Step2: PANDA-Learning from the expert preferences.

./panda_learning.sh

Step3: Test with PANDA-Insight:

./eval_gpt.sh
./eval_gpt_cot.sh

Acknowledgement

Our code for ScienceWorld is adapted from yuchenlin/SwiftSage. We thank the authors for open-sourcing their code.

Citation

If you find our project helpful to your research, please consider citing:

@inproceedings{liu2024panda,
  title={PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs},
  author={Liu, An and Yang, Zonghan and Zhang, Zhenhe and Hu, Qingyuan and Li, Peng and Yan, Ming and Zhang, Ji and Huang, Fei and Liu, Yang},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2024},
  year={2024}
}
