The SciSafeEval code is organized into three primary components:
- Generator: Interfaces with the target large language models (LLMs) to generate outputs. (code/generator.py)
- Probe: Uses the SciSafeEval dataset to probe the LLMs. (code/probe.py)
- Detector: Evaluates whether an attack succeeded based on the target LLM's output. (code/detector.py)
These components are integrated within the main.py file, which serves as the central script coordinating their interactions.
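A minimal sketch of how main.py might wire these components together is shown below. The class names, constructors, and method signatures (Generator, Probe, Detector, load_prompts, generate, is_attack_successful) are illustrative assumptions rather than the repository's actual API; consult main.py for the real interfaces.

```python
# Illustrative sketch only: class and method names below are assumptions,
# not the repository's actual API.

from generator import Generator   # wraps the target LLM (code/generator.py)
from probe import Probe           # loads SciSafeEval prompts (code/probe.py)
from detector import Detector     # judges attack success (code/detector.py)


def main() -> None:
    generator = Generator(model_name="target-llm")    # hypothetical constructor
    probe = Probe(dataset_path="scisafeeval.jsonl")   # hypothetical dataset path
    detector = Detector()

    prompts = probe.load_prompts()                    # hypothetical method
    successes = 0
    for prompt in prompts:
        response = generator.generate(prompt)         # query the target LLM
        if detector.is_attack_successful(prompt, response):
            successes += 1

    print(f"Attack success rate: {successes / len(prompts):.2%}")


if __name__ == "__main__":
    main()
```

Under these assumptions, a full run would yield a single attack-success-rate figure over the chosen dataset, which is the quantity the benchmark is designed to measure.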
Citation
```bibtex
@article{li2024scisafeeval,
  title   = {SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks},
  author  = {Li, Tianhao and Lu, Jingyu and Chu, Chuangxin and Zeng, Tianyu and Zheng, Yujia and Li, Mei and Huang, Haotian and Wu, Bin and Liu, Zuoxian and Ma, Kai and others},
  journal = {arXiv preprint arXiv:2410.03769},
  year    = {2024}
}
```
About
🧬 Official repository of SciSafeEval, a comprehensive safety benchmark for scientific large language models.