PKU-Alignment · GitHub
PKU-Alignment
Loves Sharing and Open-Source, Making AI Safer.
Large language models (LLMs) have immense potential in the field of general intelligence but come with significant risks. As a research team at Peking University, we focus on alignment techniques for LLMs, such as safety alignment, to make models safer and less toxic.
You are welcome to follow our AI Safety project:
Pinned
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
Python · 1k stars · 146 forks
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Python · 528 stars · 75 forks
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Python · 1.6k stars · 128 forks
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
Python · 388 stars · 56 forks
Repositories
Showing 10 of 25 repositories
VLA-Arena
Public
VLA-Arena is an open-source benchmark for systematic evaluation of Vision-Language-Action (VLA) models.
Python · 79 stars · Apache-2.0 · 2 forks · 3 open issues · 0 open pull requests · Updated Dec 22, 2025
SafeVLA
Public
[NeurIPS 2025 Spotlight] Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning.
Python · 98 stars · 7 forks · 0 open issues · 0 open pull requests · Updated Dec 19, 2025
safety-gymnasium
Public
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Python · 528 stars · Apache-2.0 · 75 forks · 12 open issues · 2 open pull requests · Updated Dec 4, 2025
align-anything
Public
Align Anything: Training All-modality Model with Feedback
safe-rlhf
Public
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
MM-DeceptionBench
0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Sep 25, 2025
eval-anything
Python · 21 stars · Apache-2.0 · 18 forks · 1 open issue · 2 open pull requests · Updated Jul 26, 2025
llms-resist-alignment
Python · 40 stars · 1 fork · 0 open issues · 0 open pull requests · Updated Jun 11, 2025
SAE-V
Public
[ICML 2025 Poster] SAE-V: Interpreting Multimodal Models for Enhanced Alignment
11 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 5, 2025
ProgressGym
Public
Alignment with a millennium of moral progress. Spotlight@NeurIPS 2024 Track on Datasets and Benchmarks.
Python · 24 stars · MIT · 4 forks · 0 open issues · 0 open pull requests · Updated Mar 30, 2025