conformal_llm_scores.py is the Python script for classification using 1-shot question prompts. It outputs three files:
The softmax scores for each subject, for each of the 10 prompts.
The accuracy for each subject with MMLU-based 1-shot questions, stored as a dictionary where the key is the subject name and the value is a list containing the accuracy for each of the 10 prompts.
The accuracy for each subject with GPT-4-based 1-shot questions, stored as a dictionary where the key is the subject name and the value is a list containing the accuracy for each of the 10 prompts.
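To make the shape of these outputs concrete, here is a minimal sketch of how softmax scores and a per-prompt accuracy dictionary could be built. It is not the code from conformal_llm_scores.py: the array layout, the random logits, and the names `softmax`, `accuracy_per_prompt`, and `hypothetical_subject` are assumptions made for illustration only.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis (the answer options)."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def accuracy_per_prompt(scores, labels):
    """Return one accuracy value per prompt.

    scores: array of shape (n_prompts, n_questions, n_options)
    labels: array of shape (n_questions,) with the correct option index
    """
    preds = scores.argmax(axis=-1)  # shape (n_prompts, n_questions)
    return (preds == labels).mean(axis=1).tolist()

# Hypothetical stand-in data: 10 prompts, 5 questions, 4 answer options.
rng = np.random.default_rng(0)
logits = rng.normal(size=(10, 5, 4))
labels = rng.integers(0, 4, size=5)

scores = softmax(logits)
# Key: subject name. Value: list with the accuracy for each of the 10 prompts.
accuracy = {"hypothetical_subject": accuracy_per_prompt(scores, labels)}
```

The dictionary has one entry per subject, and each list has one accuracy value per prompt, matching the description above.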
conformal.ipynb contains the results for all conformal prediction experiments and the GPT-4 vs. MMLU-based prompt comparison. It requires the three files produced by conformal_llm_scores.py. To run the experiments, download the llm_probs_gpt.zip file, unzip it into your working directory, and then run conformal.ipynb.
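For readers unfamiliar with how softmax scores feed into conformal prediction, the following is a minimal sketch of split conformal prediction. It is not taken from conformal.ipynb: the nonconformity score (one minus the softmax score of the true answer) is a common choice assumed here, and the function names and toy data are hypothetical. It requires NumPy 1.22 or later for the `method` argument of `np.quantile`.

```python
import numpy as np

def conformal_threshold(cal_scores, cal_labels, alpha=0.1):
    """Compute the calibration threshold for split conformal prediction.

    cal_scores: (n, n_options) softmax scores on the calibration set
    cal_labels: (n,) index of the correct option for each question
    alpha: target error rate (prediction sets aim for 1 - alpha coverage)
    """
    n = len(cal_labels)
    # Assumed nonconformity score: 1 - softmax score of the true answer.
    nonconf = 1.0 - cal_scores[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level, capped at 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(nonconf, level, method="higher")

def prediction_sets(test_scores, qhat):
    """Keep every option whose nonconformity score is at most qhat."""
    return [np.flatnonzero(1.0 - row <= qhat).tolist() for row in test_scores]

# Toy calibration data (hypothetical): 4 questions, 4 answer options.
cal_scores = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.60, 0.10, 0.10],
    [0.25, 0.25, 0.40, 0.10],
    [0.10, 0.10, 0.10, 0.70],
])
cal_labels = np.array([0, 1, 2, 3])
qhat = conformal_threshold(cal_scores, cal_labels, alpha=0.25)

test_scores = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.45, 0.41, 0.10, 0.04],
])
sets = prediction_sets(test_scores, qhat)
```

A confident prediction yields a small set, while a question where two options have similar scores yields a larger set.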
If you would like to run the experiments from scratch, apply for LLaMA access here, convert the original LLaMA weights to the Hugging Face format (refer here for instructions), and then run the conformal_llm_scores.py script.