🦜 PARROT

Practical And Realistic BenchmaRk for crOss-system SQL Translation

The first comprehensive benchmark for evaluating cross-system SQL translation systems

Leaderboard • Documentation • Submit Results • Paper

📢 News

09/2025: Our paper "PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation" has been accepted by NeurIPS 2025! 🎉 🎉 🎉

05/2025: We have released PARROT-1.0 (28,003 translation pairs from 38 open-source benchmarks for extensive syntax testing) and published the leaderboard.

✨ Key Features

🎯 Comprehensive	🔧 Production-Ready	🧪 Well-Tested	🌐 Multi-Dialect
598 curated pairs from 38+ benchmarks	Real-world workloads & production data	Built-in validators & parsers	10+ SQL dialects supported

🌟 Why PARROT?

✅ 598 Translation Pairs from 38+ public benchmarks and production-derived workloads
🧠 Broad Dialect Coverage: PostgreSQL, MySQL, SQLite, Oracle, SQL Server, Db2, DuckDB, Trino, Hive, Snowflake, and more
🧪 Built-in Validators: Comprehensive parsers and executability checks for multiple engines
🛠️ Complete Toolkit: Preprocessing utilities and baseline translation tools included
📊 Rigorous Evaluation: Multi-dimensional scoring (syntax and execution)
🏆 Live Leaderboard: Track your progress and compete with the community

📤 Submissions

🏆 Ready to compete? Submit your system now!

Submission Process

📋 Prepare Outputs
- Follow the example in Submission_Example/20250928_LLMTranslator_ExampleTeam.zip
- Ensure proper folder structure and file formats
📖 Read Guidelines
- Review Submission_Example/PARROT Submission Guidelines.md
- Check format requirements and naming conventions
📝 Include System Description
- Approach and methodology
- Models and versions used
- Rules and heuristics applied
- Training data sources
- Compute resources
🚀 Submit
- Upload via the leaderboard site
- Wait for evaluation results

📋 Requirements Checklist

- Consistent model versions and random seeds
- Clear indication of supported dialect pairs
- Valid UTF-8 text file outputs
- Exact versions of LLM prompts/rule files included
- System description document included
- Reproducibility instructions provided

⚠️ Important: Include exact versions of all dependencies, prompts, and rule files for reproducibility.
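
The "Valid UTF-8 text file outputs" item can be checked locally before uploading. The snippet below is a minimal sketch, not part of the PARROT toolkit: the function names and the `*.sql` file pattern are assumptions made for illustration, and the actual output layout is defined in the submission guidelines.

```python
from pathlib import Path


def is_valid_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_invalid_outputs(root: Path, pattern: str = "*.sql") -> list[Path]:
    """List files under `root` matching `pattern` that are not valid UTF-8.

    The default pattern is a hypothetical choice; adjust it to whatever
    file extension your submission actually uses.
    """
    return sorted(
        p for p in root.rglob(pattern) if p.is_file() and not is_valid_utf8(p)
    )
```

Calling `find_invalid_outputs(Path("my_submission"))` on an unpacked submission folder (the folder name is a placeholder) returns an empty list when every matching file decodes as UTF-8.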

🏁 Leaderboard Rules

Rule	Description
⏱️ Frequency	One submission per team per month (TBD)
📝 Transparency	Disclose all training data and public resources
🏷️ Documentation	Clearly mark manual rules or prompts
🚫 Fairness	No test set contamination or hand-tuning
✅ Verification	Results may be verified; additional materials may be requested

🧱 Baselines

We recommend CrackSQL as an LLM-based baseline.

CrackSQL is a powerful SQL dialect translation tool that integrates rule-based strategies with LLMs for high accuracy. It enables seamless conversion between dialects (e.g., PostgreSQL → MySQL) with flexible access through Python API, command line, and web interface.

🧪 Task Definition

Goal: Translate SQL from one database dialect to another while preserving semantic equivalence.

Input:  (source_dialect, target_dialect, source_sql)
Output: target_sql
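
One way to carry this input/output tuple through your own code is a pair of small record types. This is only a sketch: the field names come from the task definition above, while the class names and the JSON layout are assumptions and do not describe the official submission format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TranslationTask:
    """Input triple from the task definition."""
    source_dialect: str
    target_dialect: str
    source_sql: str


@dataclass(frozen=True)
class TranslationResult:
    """A task together with the translated SQL produced for it."""
    task: TranslationTask
    target_sql: str

    def to_json(self) -> str:
        # asdict() recurses into the nested task, so the output has the
        # shape {"task": {...}, "target_sql": "..."}.
        return json.dumps(asdict(self), ensure_ascii=False)
```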

Example

-- Source (PostgreSQL)
SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*) 
FROM users 
WHERE age > 25 
GROUP BY EXTRACT(YEAR FROM created_at);
-- Target (MySQL)
SELECT YEAR(created_at) AS year, COUNT(*) 
FROM users 
WHERE age > 25 
GROUP BY YEAR(created_at);
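
To make the rewrite in this example concrete, here is a toy, purely textual rule written in Python. It is a hedged illustration of a single rule, not how CrackSQL or any PARROT baseline works: it ignores string literals and comments, and it only handles arguments without nested parentheses. The function name is hypothetical.

```python
import re

# Matches EXTRACT(YEAR FROM <expr>) where <expr> contains no parentheses.
_EXTRACT_YEAR = re.compile(
    r"EXTRACT\s*\(\s*YEAR\s+FROM\s+([^()]+?)\s*\)", re.IGNORECASE
)


def rewrite_extract_year(sql: str) -> str:
    """Rewrite EXTRACT(YEAR FROM x) into YEAR(x), leaving the rest untouched."""
    return _EXTRACT_YEAR.sub(lambda m: f"YEAR({m.group(1)})", sql)


source_sql = (
    "SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*) "
    "FROM users WHERE age > 25 "
    "GROUP BY EXTRACT(YEAR FROM created_at);"
)
print(rewrite_extract_year(source_sql))
```

Running it prints the MySQL form of the query above on a single line. A real translator needs a parser for each dialect, since regular expressions cannot reliably handle nesting or quoting.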

📊 Benchmark Statistics

Metric	Value
Translation Pairs	598
Source Benchmarks	38+
SQL Dialects	10+
Supported Engines	15+
Domain Types	Single & Cross-domain

📦 Benchmark Contents

PARROT/
├── 📁 benchmark/          # Source datasets from 38+ benchmarks
│   ├── Spider/           # Cross-domain SQL queries
│   ├── SParC/            # Multi-turn conversations
│   ├── BIRD/             # Complex real-world queries
│   ├── TPC-H FROID/      # UDF-heavy workloads
│   └── ...               # 34+ more benchmarks
├── 🔍 validator/         # Grammar parsers & validators
│   ├── pg_parser/        # PostgreSQL parser
│   ├── mysql_parser/     # MySQL parser
│   ├── oracle_parser/    # Oracle parser
│   └── ...               # 10+ more dialect parsers
├── ⚙️ processor/         # Preprocessing utilities
├── 🔄 translator/        # Baseline translation tools
└── 📤 Submission_Example/ # Submission templates

Supported Benchmarks

Benchmark	Year	SQL Dialects	Language	Domain Type	Turns	Collection
ATIS	1994	SQLite, MySQL	English	Single-domain	Single	Manual
GeoQuery	1996	MySQL, SQLite	English	Single-domain	Single	Manual
Restaurants	2000	SQLite	English	Single-domain	Single	Manual
Academic	2014	Unspecified	English	Single-domain	Single	Manual
IMDb	2017	Unspecified	English	Single-domain	Single	Manual
Yelp	2017	Unspecified	English	Single-domain	Single	Manual
Scholar	2017	Unspecified	English	Single-domain	Single	Manual
WikiSQL	2017	SQLite	English	Cross-domain	Single	Manual
Advising	2018	SQLite, MySQL	English	Single-domain	Single	Manual
Spider	2018	SQLite	English	Cross-domain	Single	Manual
SParC	2019	SQLite	English	Cross-domain	Multiple	Manual
CoSQL	2019	SQLite	English	Cross-domain	Multiple	Manual
CSpider	2019	SQLite	Chinese	Cross-domain	Single	Manual
MIMICSQL	2020	SQLite	English	Single-domain	Single	Hybrid†
SQUALL	2020	SQLite	English	Cross-domain	Single	Manual
FIBEN	2020	IBM Db2, PostgreSQL	English	Single-domain	Single	Manual
ViText2SQL	2020	General SQL	Vietnamese	Cross-domain	Single	Manual
DuSQL	2020	Unspecified	Chinese	Cross-domain	Single	Hybrid†
PortugueseSpider	2021	SQLite	Portuguese	Cross-domain	Single	Hybrid†
CHASE	2021	SQLite	Chinese	Cross-domain	Multiple	Manual
Spider-Syn	2021	SQLite	English	Cross-domain	Single	Manual
Spider-DK	2021	SQLite	English	Cross-domain	Single	Manual
Spider-Realistic	2021	SQLite	English	Cross-domain	Single	Manual
KaggleDBQA	2021	SQLite	English	Cross-domain	Single	Manual
SEDE	2021	T-SQL	English	Single-domain	Single	Manual
MT-TEQL	2021	SQLite	English	Cross-domain	Single	Automatic
PAUQ	2022	SQLite	Russian	Cross-domain	Single	Manual
knowSQL	2022	Unspecified	Chinese	Cross-domain	Single	Manual
Dr.Spider	2023	SQLite	English	Cross-domain	Single	Hybrid†
BIRD	2023	SQLite	English	Cross-domain	Single	Manual
AmbiQT	2023	SQLite	English	Cross-domain	Single	LLM-aided
ScienceBenchmark	2024	General SQL	English	Single-domain	Single	Hybrid†
BookSQL	2024	SQLite	English	Single-domain	Single	Manual
Archer	2024	SQLite	English/Chinese	Cross-domain	Single	Manual
BULL	2024	SQLite	English/Chinese	Single-domain	Single	Manual
Spider2	2024	SQLite, DuckDB, PostgreSQL	English	Cross-domain	Single	Manual
TPC-H FROID	2018	T-SQL, PostgreSQL	English	Cross-domain	Single	Hybrid†
DSB	2021	T-SQL, PostgreSQL	English	Decision Support	Single	Hybrid†
TPC-DS	2005	T-SQL, PostgreSQL	English	Decision Support	Single	Hybrid†
SQL-ProcBench	2021	SQL Server, PostgreSQL, IBM Db2	English	Single-domain	Single	Production-derived

† Hybrid means the dataset was created using both automatic generation and manual annotation.

🧮 Evaluation & Scoring

PARROT evaluates systems along two dimensions:

Dimension	Description
🔍 Syntax Validity	Can the SQL be parsed by the target dialect?
⚡ Execution Checks	Result equivalence when data available
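
The two checks can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not PARROT's validator: SQLite stands in for both the source and the target engine, `EXPLAIN` is used to compile a statement without running it (so a missing table also counts as a failure), and result equivalence is taken to mean equal multisets of rows with row order ignored. All function names are hypothetical.

```python
import sqlite3
from collections import Counter


def is_syntactically_valid(conn: sqlite3.Connection, sql: str) -> bool:
    """Check that the engine can compile `sql` against the schema in `conn`."""
    try:
        conn.execute(f"EXPLAIN {sql}")  # compiles the statement, does not run it
        return True
    except sqlite3.Error:
        return False


def results_match(
    ref_conn: sqlite3.Connection,
    reference_sql: str,
    target_conn: sqlite3.Connection,
    candidate_sql: str,
) -> bool:
    """Compare the two result sets as multisets of rows (order is ignored)."""
    try:
        expected = Counter(ref_conn.execute(reference_sql).fetchall())
        actual = Counter(target_conn.execute(candidate_sql).fetchall())
    except sqlite3.Error:
        return False
    return expected == actual


# Tiny demo database; the same connection plays both engines here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER, age INTEGER, created_at TEXT);
    INSERT INTO users VALUES
        (1, 30, '2023-05-01'), (2, 41, '2024-01-15'), (3, 19, '2024-03-09');
    """
)
reference = (
    "SELECT strftime('%Y', created_at) AS year, COUNT(*) "
    "FROM users WHERE age > 25 GROUP BY 1"
)
candidate = (
    "SELECT substr(created_at, 1, 4) AS year, COUNT(*) "
    "FROM users WHERE age > 25 GROUP BY year"
)
print(is_syntactically_valid(conn, candidate))
print(results_match(conn, reference, conn, candidate))
```

In a real cross-system setting the two connections would point at different engines loaded with the same data, and value types may need normalizing before comparison (for example, a year returned as text by one engine and as an integer by another).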

📚 Citation

If you use PARROT in your research, please cite:

@inproceedings{zhou2025parrot,
  author       = {Wei Zhou and Guoliang Li and Haoyu Wang and Yuxing Han and Xufei Wu and Fan Wu and Xuanhe Zhou},
  title        = {PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation},
  booktitle    = {Advances in Neural Information Processing Systems (NeurIPS)},
  year         = {2025}
}
@article{zhou2025cracksql,
  author       = {Wei Zhou and Yuyang Gao and Xuanhe Zhou and Guoliang Li},
  title        = {Cracking SQL Barriers: An LLM-based Dialect Translation System},
  journal      = {Proceedings of the ACM on Management of Data},
  volume       = {3},
  number       = {3 (SIGMOD)},
  year         = {2025}
}
@article{zhou2025cracksqldemo,
  author       = {Wei Zhou and Yuyang Gao and Xuanhe Zhou and Guoliang Li},
  title        = {CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models},
  journal      = {arXiv Preprint},
  url          = {https://arxiv.org/abs/2504.00882},
  year         = {2025}
}

📄 License

This project is released under the MIT License. See the LICENSE file for details.

📬 Contact & Support

Questions? Feedback? Want to submit?

📧 Email: weizhoudb@sjtu.edu.cn

💬 Contributions: Issues and PRs are welcome!

🙏 Acknowledgments

Made with ❤️ by

Shanghai Jiao Tong University • Tsinghua University • Bytedance Team

⭐ Star us on GitHub if you find this project useful!
