📃Paper | 🤗Models & Datasets Repo
This repository contains the official implementation for the paper "Optimizing Length Compression in Large Reasoning Models".
Our work introduces LC-R1, a novel post-training method to compress the lengthy reasoning process of Large Reasoning Models (LRMs). By identifying and eliminating "invalid thinking"—redundant self-verification after a correct answer is found—LC-R1 significantly improves computational efficiency. Our two core principles, Brevity and Sufficiency, guide the model to produce concise yet complete reasoning.
On average, across multiple benchmarks and two model sizes (1.5B and 7B), LC-R1 achieves approximately 50% token reduction in the reasoning process with only a marginal ~2% drop in accuracy.
Below is the Pareto analysis showing the trade-off between reasoning length compression and accuracy. LC-R1 achieves a superior position on the frontier, indicating high compression with minimal performance degradation.
Figure 1: Pareto analysis of the Efficacy-Efficiency trade-off.
- [06/2025] We are excited to release the code, models, and datasets for the paper "Optimizing Length Compression in Large Reasoning Models"!
Large Reasoning Models (LRMs) have shown remarkable capabilities but often produce overly verbose and computationally expensive reasoning chains, a phenomenon we term "overthinking". A key issue is "invalid thinking," where models repeatedly double-check their work after deriving the correct answer.
To address this, we propose LC-R1, a post-training method based on Group Relative Policy Optimization (GRPO). LC-R1 uses a dual-reward system:
- A Length Reward to encourage overall conciseness.
- A Compress Reward to specifically penalize "invalid thinking" and terminate the reasoning process once the correct answer is found.
This approach effectively balances Brevity (eliminating redundancy) and Sufficiency (preserving essential reasoning steps). The diagram below illustrates our method's pipeline.
Figure 2: The LC-R1 training pipeline.
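To make the reward design concrete, here is a minimal sketch of how the two rewards might be scored for a single rollout. The function names, the answer-matching heuristic, and the normalization below are our own illustrative assumptions, not the paper's exact formulation:

```python
# Illustrative sketch of a dual reward in the spirit of LC-R1 (hypothetical,
# not the official implementation). `rollout` is the generated reasoning text,
# `answer` is the gold answer string, and `group_lengths` holds the lengths of
# all rollouts sampled for the same prompt in the GRPO group (same units as
# len(rollout); characters here for simplicity).

def length_reward(rollout: str, group_lengths: list[int]) -> float:
    """Brevity: reward rollouts that are shorter than the group average."""
    mean_len = sum(group_lengths) / len(group_lengths)
    return (mean_len - len(rollout)) / max(mean_len, 1.0)

def compress_reward(rollout: str, answer: str) -> float:
    """Compression: penalize text produced after the correct answer first
    appears ("invalid thinking"); neutral if the answer never appears."""
    idx = rollout.find(answer)
    if idx == -1:
        return 0.0
    valid_prefix = idx + len(answer)         # reasoning up to the first answer
    redundant = len(rollout) - valid_prefix  # self-verification afterwards
    return -redundant / max(len(rollout), 1)
```

In GRPO, per-rollout scores like these would be combined and normalized within each sampled group to form advantages; see the paper for the exact reward shaping.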
We evaluated LC-R1 on two model sizes (DeepSeek-R1-Distill-Qwen-7B and 1.5B) across seven diverse reasoning benchmarks, including mathematics, general reasoning, and coding tasks.
The results demonstrate that LC-R1 consistently outperforms other compression methods. It provides the best trade-off between efficiency and efficacy, achieving substantial token reduction while maintaining competitive accuracy. Furthermore, LC-R1 models exhibit a significantly higher Valid Thinking (VT) rate (over 97%), confirming their ability to eliminate redundant reasoning effectively.
Below are the main results from our experiments.
Table 1: Main experimental results on accuracy and sequence length.
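For intuition on the Valid Thinking (VT) rate reported above, here is a sketch under one possible reading: VT is the share of the thinking segment up to and including the point where the correct answer is first derived (the paper's exact definition may differ). It mirrors the compress-reward heuristic sketched earlier:

```python
# Hypothetical Valid Thinking (VT) rate: everything after the first
# in-thought occurrence of the correct answer counts as "invalid thinking".
def vt_rate(thinking: str, answer: str) -> float:
    idx = thinking.find(answer)
    if idx == -1:
        return 1.0  # answer never derived mid-thought: nothing to flag
    return (idx + len(answer)) / len(thinking)
```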
Follow these steps to set up the environment, download the necessary resources, and run the training scripts.
First, create and activate a conda environment. Then, clone the repository and install the required dependencies.
```bash
# Create and activate a new conda environment
conda create -n lcr1 python=3.10
conda activate lcr1

# Clone the repository
git clone https://github.com/zxiangx/LC-R1.git
cd LC-R1

# Install the project and its dependencies in editable mode
pip install -e .
```

Run the provided script to download the models and datasets from the Hugging Face Hub.
```bash
cd scripts
python pull_from_hub.py
```
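If you prefer to fetch the assets manually, the pull script boils down to something like the following. This is a minimal sketch assuming the `huggingface_hub` library; the repo IDs and local paths are placeholders, not the actual resources used by `pull_from_hub.py`:

```python
# Minimal sketch of pulling models/datasets from the Hugging Face Hub.
# NOTE: repo IDs and local_dir values are placeholders; see
# scripts/pull_from_hub.py for the resources this project actually uses.
from huggingface_hub import snapshot_download

# Download a model repository
snapshot_download(repo_id="org/model-name", local_dir="models/model-name")

# Download a dataset repository
snapshot_download(repo_id="org/dataset-name", repo_type="dataset",
                  local_dir="datasets/dataset-name")
```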
To start training, configure the model and save paths in the shell scripts:

- Open `scripts/lcr1_7B.sh` (or `scripts/lcr1_1.5B.sh`).
- Set the `MODEL_PATH` and `SAVE_PATH` variables to your actual paths:
  - `MODEL_PATH`: path to the pre-trained model you want to fine-tune.
  - `SAVE_PATH`: directory where the trained model and checkpoints will be saved.
Once configured, execute the script to begin training:
```bash
# For the 7B model
bash scripts/lcr1_7B.sh

# Or for the 1.5B model
bash scripts/lcr1_1.5B.sh
```

Training Outputs:
- Model checkpoints will be saved to `$SAVE_PATH/ckpt`.
- The final trained model parameters will be stored in `$SAVE_PATH/model`.
- Training logs from vLLM will be written to `scripts/log/vllm.log`.
Customizing Training:

- To modify training parameters (e.g., learning rate, batch size), edit `scripts/train.py`.
- To modify the DeepSpeed settings, edit the configuration files in `scripts/accelerate_configs/`.
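For orientation, these are the kinds of knobs to look for when editing `scripts/train.py`. All names and values below are hypothetical and may differ from the actual script; treat this as a checklist, not a diff:

```python
# Hypothetical GRPO training knobs; the real variable names live in
# scripts/train.py and may differ.
learning_rate = 1e-6               # policy learning rate
per_device_train_batch_size = 4    # prompts per device per step
num_generations = 8                # rollouts per prompt (GRPO group size)
max_completion_length = 8192       # cap on generated reasoning length
```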
Our work leverages and benefits from the following excellent open-source projects. We express our sincere gratitude to their developers and contributors.
- TRL is a flexible library for training language models with reinforcement learning.
- LLaMA-Factory is a unified framework for efficient fine-tuning of large language models.
If you find the content of this project helpful, please cite our paper as follows:
@misc{cheng2025optimizinglengthcompressionlarge,
title={Optimizing Length Compression in Large Reasoning Models},
author={Zhengxiang Cheng and Dongping Chen and Mingyang Fu and Tianyi Zhou},
year={2025},
eprint={2506.14755},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2506.14755},
}

