Safety Pretraining: SafeLM Models, Data, Benchmarks

Safety Pretraining: Toward the Next Generation of Safe AI

Pratyush Maini*, Sachin Goyal*, Dylan Sam*,
Alex Robey, Yash Savani, Yiding Jiang, Andy Zou,
Matt Fredrikson, Zachary C. Lipton, J. Zico Kolter

Carnegie Mellon University, DatologyAI, Center for AI Safety, Gray Swan AI

* Equal contribution

Figure: Safety Pretraining overview

TL;DR: We build safety directly into the pretraining pipeline through data‑centric interventions, producing a family of 1.7B‑parameter models that are natively safe. Everything (code, data, and weights) is open source.

Models & Checkpoints

SafeLM‑1.7B

Natively-Safe Base Model

SafeLM‑1.7B-Instruct

Instruction Tuned Model

Safety Classifier

Lightweight embedding-based model to score the safety of web text.
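
To make "embedding-based scoring" concrete, here is a minimal, illustrative sketch. It is not the released classifier: the hashed bag-of-words embedding, the dimension `DIM`, the function names, and the convention that a higher score means "less safe" are all assumptions chosen for illustration. In practice the embedding would come from a pretrained encoder and the weights from training on labelled text.

```python
import hashlib
import math

DIM = 64  # hypothetical embedding size, chosen only for this sketch


def embed(text: str, dim: int = DIM) -> list[float]:
    """Toy hashed bag-of-words embedding, L2-normalised.

    A stand-in for a real pretrained text encoder.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # Map each token to one of `dim` buckets and count occurrences.
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def safety_score(text: str, weights: list[float], bias: float = 0.0) -> float:
    """Logistic head over the embedding; returns a value in (0, 1).

    Assumed convention for this sketch: higher means less safe.
    """
    z = bias + sum(w * x for w, x in zip(weights, embed(text)))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    untrained = [0.0] * DIM  # placeholder weights, not learned
    print(safety_score("an example sentence from the web", untrained))
```

With all-zero placeholder weights the head has no signal, so every input is scored at the midpoint; meaningful scores require trained weights.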

Datasets

SafeWeb

Context‑rich rewrites of harmful web text.

RefuseWeb

A diverse dataset of web text repurposed into refusals to unsafe requests.

Moral Education

Moral and educational lessons about safety.

Evaluation & Code

Base Model Safety Benchmarks

Completion-style Safety Evaluations

Training Code

Reproducible LitGPT recipes & configs.

Stay Tuned!

Made with ❤️ by the Safety Pretraining team · GitHub
