RLHF-V
Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
Tianyu Yu1, Yuan Yao2, Haoye Zhang1, Taiwen He1, Yifeng Han1, Ganqu Cui1, Jinyi Hu1, Zhiyuan Liu1*, Hai-Tao Zheng1*, Maosong Sun1, Tat-Seng Chua2
1. Tsinghua University   2. National University of Singapore
Abstract
Existing Multimodal Large Language Models (MLLMs) prevalently suffer from serious hallucination problems, generating text that is not factually grounded in the associated images. Our RLHF-V framework enhances MLLM trustworthiness via behavior alignment from fine-grained correctional human feedback.
- Fine-Grained and Diverse Human Preference Data: We collect 1.4k fine-grained human feedback samples consisting of 3.7k segment-level corrections, covering hallucination types including objects (41.2%), positions (20.3%), numbers (16.5%), attributes (10%), actions (5.3%), and others (6.8%). See the data-format sketch after this list.
- High Data Efficiency and Scalability: With just 1.4k annotated samples, we achieve a 34.8% reduction in model hallucinations. Moreover, the reduction in hallucinations becomes more pronounced as more data is used.
- Enhanced Performance and Computational Efficiency: Our fine-grained correctional human feedback enables more precise credit assignment for the desired behavior, allowing efficient training: 1 hour on 8 A100 GPUs is enough to achieve promising results.
- Outstanding Trustworthiness without Compromising Helpfulness: Our model surpasses existing open-source MLLMs in reducing hallucination rates, mitigates hallucinations caused by over-generalization, and maintains informativeness. Surprisingly, RLHF-V is even more resistant to the over-generalization problem than GPT-4V.
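As a rough illustration of what one segment-level correctional feedback record could look like, the sketch below pairs a hallucinated response with its human-corrected version and the character spans that were changed. The field names and example values are assumptions for illustration, not the released data schema.

```python
# Illustrative sketch of one fine-grained correctional feedback record.
# Field names are hypothetical; consult the released dataset for the actual schema.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CorrectionalFeedback:
    image_id: str                 # image the response is grounded in
    question: str                 # prompt given to the MLLM
    original_response: str        # model output containing hallucinations
    corrected_response: str       # human-corrected, factually grounded output
    # (start, end) character offsets in corrected_response marking the
    # segments the annotator rewrote, i.e. the segment-level corrections.
    corrected_spans: List[Tuple[int, int]] = field(default_factory=list)


example = CorrectionalFeedback(
    image_id="coco_000000123456",
    question="Describe the image in detail.",
    original_response="A man in a red shirt is walking two dogs.",
    corrected_response="A man in a blue shirt is walking one dog.",
    corrected_spans=[(11, 15), (33, 40)],  # "blue", "one dog"
)
```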
Method
The proposed RLHF-V framework:
We collect 1.4k fine-grained dense feedback samples by asking human annotators to correct the hallucinated segments in model responses. Training takes only 1 hour on 8 A100 GPUs to obtain RLHF-V-13B, which is initialized from our RLHF-V_SFT-13B.
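A minimal sketch of one way such (original, corrected) response pairs could be optimized: a DPO-style preference loss that treats the corrected response as preferred and up-weights the tokens inside the corrected segments, so that credit is assigned to the specific behavior the annotator fixed. The weighting scheme and function names below are illustrative assumptions, not the exact RLHF-V training recipe.

```python
# Minimal sketch: DPO-style preference optimization on correctional pairs,
# with tokens inside human-corrected segments up-weighted. Illustrative only.
import torch
import torch.nn.functional as F


def weighted_seq_logprob(token_logps: torch.Tensor,
                         corrected_mask: torch.Tensor,
                         weight: float = 2.0) -> torch.Tensor:
    """Sum per-token log-probs, up-weighting tokens in corrected segments.

    token_logps:    (batch, seq_len) log-probabilities of response tokens
    corrected_mask: (batch, seq_len) bool, True for tokens inside a correction
    """
    weights = 1.0 + (weight - 1.0) * corrected_mask.float()
    return (token_logps * weights).sum(dim=-1)


def preference_loss(logp_chosen: torch.Tensor,
                    logp_rejected: torch.Tensor,
                    ref_logp_chosen: torch.Tensor,
                    ref_logp_rejected: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """DPO-style loss: corrected response is 'chosen', original is 'rejected'."""
    chosen_ratio = logp_chosen - ref_logp_chosen          # policy vs. frozen reference
    rejected_ratio = logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```

In this kind of formulation, the segment weighting is what makes the feedback dense: the preference signal concentrates on the hallucinated spans instead of being spread uniformly over the whole response.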
Highlights
- Low hallucination rate while remaining informative.
- Data-efficient, with good scaling results.
- More resistant to over-generalization.
BibTeX
@article{2023rlhf-v,
  author  = {Tianyu Yu and Yuan Yao and Haoye Zhang and Taiwen He and Yifeng Han and Ganqu Cui and Jinyi Hu and Zhiyuan Liu and Hai-Tao Zheng and Maosong Sun and Tat-Seng Chua},
  title   = {RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback},
  journal = {arXiv preprint},
  year    = {2023},
}
Examples
- Short-form QA: RLHF-V gives more trustworthy answers in short-form QA.
- Long-form QA: RLHF-V generates informative image descriptions with fewer hallucinations.
- Long-form QA: RLHF-V provides detailed reasoning with fewer hallucinations.
- Long-form QA: RLHF-V is more resistant to over-generalization.