To generate reviews for answers from all pairs of models, run the gen_{reviewer}_all.sh script for the chosen reviewer. For example:
./gen_claude_all.sh
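The {reviewer} placeholder in the script name stands for the reviewer model, as in gen_claude_all.sh. A minimal Python sketch of that naming pattern follows. The helper names review_script and available_reviewers are hypothetical and not part of this repository. The sketch only builds and discovers script paths; it does not run anything.

```python
from pathlib import Path


def review_script(reviewer, root="."):
    """Build the path of gen_{reviewer}_all.sh for a reviewer name (hypothetical helper)."""
    return Path(root) / f"gen_{reviewer}_all.sh"


def available_reviewers(root="."):
    """List reviewer names for which a gen_*_all.sh script exists in root."""
    prefix, suffix = "gen_", "_all.sh"
    names = []
    for path in sorted(Path(root).glob(f"{prefix}*{suffix}")):
        # Strip the fixed prefix and suffix to recover the reviewer name.
        names.append(path.name[len(prefix):-len(suffix)])
    return names
```

Calling available_reviewers() from the directory that holds the scripts shows which reviewer names can be substituted into the command above.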
Peer Ranking
To run peer ranking, open the peer_ranking.ipynb file in Jupyter Notebook.
Peer Discussion
Enter the peer_discussion folder with the following command:
cd peer_discussion/
Before running any Python script, make sure that config.yml contains the configuration you need.
Reviews Generation
python review_lfqa.py
There is no separate code for generating reviews for Vicuna80, since those reviews are provided with the Peer Rank code.
Discussion Generation
# discuss on LFQA
python gather_all_lfqa.py
python discuss_lfqa.py
# discuss on Vicuna80
python gather_all_vicuna80.py
python discuss_vicuna80.py
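The two commands per dataset can also be chained from one place. The following is a minimal sketch, assuming the gather script must finish successfully before the discuss script runs (the order shown above) and that it is run from inside the peer_discussion folder. The function names discussion_commands and run_discussion are hypothetical and not part of this repository.

```python
import subprocess
import sys

# Dataset names taken from the script filenames above.
DATASETS = ("lfqa", "vicuna80")


def discussion_commands(dataset):
    """Return the gather and discuss commands for one dataset, in order."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        [sys.executable, f"gather_all_{dataset}.py"],
        [sys.executable, f"discuss_{dataset}.py"],
    ]


def run_discussion(dataset, cwd="."):
    """Run both steps; check=True raises and stops if a step fails."""
    for cmd in discussion_commands(dataset):
        subprocess.run(cmd, cwd=cwd, check=True)


# Example (run from the peer_discussion folder):
# run_discussion("lfqa")
```

Using check=True means a failure in the gather step raises subprocess.CalledProcessError, so the discuss step is not started on incomplete input.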
Citation
Please cite the following if you find our work helpful.
@misc{li2023prd,
title={PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations},
author={Ruosen Li and Teerth Patel and Xinya Du},
year={2023},
eprint={2307.02762},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Contact
The following two options are available for any clarification, comments, or suggestions: