The SciSafeEval code is organized into three primary components:
- Generator: Interfaces with the target large language models (LLMs) to generate outputs. (code/generator.py)
- Probe: Uses the SciSafeEval dataset to probe the LLMs. (code/probe.py)
- Detector: Evaluates whether an attack succeeded based on the target LLM's output. (code/detector.py)
These components are integrated within the main.py file, which serves as the central script coordinating their interactions.
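A minimal sketch of how main.py might wire these components together is shown below. The class names, constructors, and method signatures (Generator, Probe, Detector, load_prompts, generate, is_attack_successful) are illustrative assumptions rather than the repository's actual API; consult main.py for the real interfaces.

```python
# Illustrative sketch only: class and method names below are assumptions,
# not the repository's actual API.

from generator import Generator   # wraps the target LLM (code/generator.py)
from probe import Probe           # loads SciSafeEval prompts (code/probe.py)
from detector import Detector     # judges attack success (code/detector.py)


def main() -> None:
    generator = Generator(model_name="target-llm")    # hypothetical constructor
    probe = Probe(dataset_path="scisafeeval.jsonl")   # hypothetical dataset path
    detector = Detector()

    prompts = probe.load_prompts()                    # hypothetical method
    successes = 0
    for prompt in prompts:
        response = generator.generate(prompt)         # query the target LLM
        if detector.is_attack_successful(prompt, response):
            successes += 1

    print(f"Attack success rate: {successes / len(prompts):.2%}")


if __name__ == "__main__":
    main()
```

Under these assumptions, a full run would yield a single attack-success-rate figure over the chosen dataset, which is the quantity the benchmark is designed to measure.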
Citation
```bibtex
@article{li2024scisafeeval,
  title   = {SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks},
  author  = {Li, Tianhao and Lu, Jingyu and Chu, Chuangxin and Zeng, Tianyu and Zheng, Yujia and Li, Mei and Huang, Haotian and Wu, Bin and Liu, Zuoxian and Ma, Kai and others},
  journal = {arXiv preprint arXiv:2410.03769},
  year    = {2024}
}
```
About
🧬 Official repository of SciSafeEval, a comprehensive safety benchmark for scientific large language models.