Handling Parquet Files
DuckDB has advanced support for Parquet files, including the ability to query them directly. When deciding whether to query these files directly or to first load them into the database, you need to consider several factors.
Reasons for Querying Parquet Files
Availability of basic statistics: Parquet files use a columnar storage format and contain basic statistics such as zonemaps. Thanks to these features, DuckDB can leverage optimizations such as projection and filter pushdown on Parquet files. Therefore, workloads that combine projection, filtering, and aggregation tend to perform quite well when run on Parquet files (see the example below).
Storage considerations: Loading the data from Parquet files will require approximately the same amount of space for the DuckDB database file. Therefore, if the available disk space is constrained, it is worth running the queries directly on Parquet files.
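For instance, a query along these lines (using a hypothetical taxi.parquet file and column names) runs directly on the file while benefiting from both pushdown optimizations:

SELECT passenger_count, avg(fare_amount) AS avg_fare
FROM 'taxi.parquet'          -- query the Parquet file directly, no loading required
WHERE trip_distance > 10     -- filter pushdown: zonemaps allow skipping row groups
GROUP BY passenger_count;    -- projection pushdown: only the three referenced columns are read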
Reasons against Querying Parquet Files
Lack of advanced statistics: The DuckDB database format has HyperLogLog statistics, which Parquet files lack. These improve the accuracy of cardinality estimates and are especially important if the queries contain a large number of join operators.
Tip: If you find that DuckDB produces a suboptimal join order on Parquet files, try loading the Parquet files into DuckDB tables. The improved statistics will likely help obtain a better join order.
Repeated queries: If you plan to run multiple queries on the same dataset, it is worth loading the data into DuckDB. The queries will always be somewhat faster, which over time amortizes the initial load time.
High decompression times: Some Parquet files are compressed using heavyweight compression algorithms such as gzip. In these cases, querying the Parquet files requires an expensive decompression step every time the file is accessed. Meanwhile, lightweight compression methods such as Snappy, LZ4, and zstd are faster to decompress. You may use the parquet_metadata function to find out which compression algorithm a file uses.
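For instance, the following query (on a hypothetical example.parquet file) lists the compression codec used by each column:

SELECT DISTINCT path_in_schema, compression
FROM parquet_metadata('example.parquet');   -- one row per column chunk; DISTINCT collapses repeated row groups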
Microbenchmark: Running TPC-H on a DuckDB Database vs. Parquet
The queries on the TPC-H benchmark run approximately 1.1-5.0× slower on Parquet files than on a DuckDB database.
Best practice: If you have the storage space available, and you have a join-heavy workload and/or plan to run many queries on the same dataset, load the Parquet files into the database first. The compression algorithm and the row group sizes of the Parquet files have a large effect on performance: study these using the parquet_metadata function.
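A minimal sketch of such a load, using a hypothetical lineitem.parquet file:

CREATE TABLE lineitem AS
    SELECT * FROM read_parquet('lineitem.parquet');

Subsequent queries then run against DuckDB's own storage format, which carries the more detailed statistics discussed above.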
The Effect of Row Group Sizes
DuckDB works best on Parquet files with row groups of 100K-1M rows each. The reason is that DuckDB can only parallelize over row groups, so if a Parquet file has a single giant row group, it can only be processed by a single thread. You can use the parquet_metadata function to figure out how many row groups a Parquet file has. When writing Parquet files, use the row_group_size option.
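For example, with a hypothetical example.parquet input and a table tbl to export, the following statements inspect and control row group sizes:

-- count the row groups of an existing Parquet file
SELECT count(DISTINCT row_group_id) AS row_groups
FROM parquet_metadata('example.parquet');

-- write a new file with approximately 100,000 rows per row group
COPY tbl TO 'output.parquet' (FORMAT parquet, ROW_GROUP_SIZE 100000);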
Microbenchmark: Running Aggregation Query at Different Row Group Sizes
We run a simple aggregation query over Parquet files using different row group sizes, selected between 960 and 1,966,080. The results are as follows.
| Row group size | Execution time |
|---------------:|---------------:|
| 960 | 8.77 s |
| 1920 | 8.95 s |
| 3840 | 4.33 s |
| 7680 | 2.35 s |
| 15360 | 1.58 s |
| 30720 | 1.17 s |
| 61440 | 0.94 s |
| 122880 | 0.87 s |
| 245760 | 0.93 s |
| 491520 | 0.95 s |
| 983040 | 0.97 s |
| 1966080 | 0.88 s |
The results show that row group sizes below 5,000 have a strongly detrimental effect, making runtimes more than 5-10× longer than those of ideally sized row groups, while row group sizes between 5,000 and 20,000 are still 1.5-2.5× off from the best performance. Above a row group size of 100,000, the differences are small: the gap between the best and the worst runtime is about 10%.
Parquet File Sizes
DuckDB can also parallelize across multiple Parquet files. It is advisable to have at least as many total row groups across all files as there are CPU threads. For example, on a machine with 10 threads, both 10 files with 1 row group each and 1 file with 10 row groups achieve full parallelism. It is also beneficial to keep the sizes of individual Parquet files moderate.
Best practice: The ideal range is between 100 MB and 10 GB per individual Parquet file.
Hive Partitioning for Filter Pushdown
When querying many files with filter conditions, performance can be improved by using a Hive-format folder structure to partition the data along the columns used in the filter condition. DuckDB will only need to read the folders and files that meet the filter criteria. This can be especially helpful when querying remote files.
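As a sketch, assuming the data is laid out in a hypothetical orders/year=.../month=.../ folder hierarchy, a filter on the partition columns lets DuckDB skip entire folders:

SELECT *
FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true)
WHERE year = 2024
  AND month = 11;   -- only files under the matching year=2024/month=11 folder are read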
More Tips on Reading and Writing Parquet Files
For tips on reading and writing Parquet files, see the Parquet Tips page.
Loading CSV Files
CSV files are often distributed in a compressed format such as gzip (.csv.gz). DuckDB can decompress these files on the fly. In fact, this is typically faster than decompressing the files first and then loading them, thanks to the reduced IO (an example follows the table below).
| Method | Load time |
|--------|----------:|
| Load from gzip-compressed CSV files (.csv.gz) | 107.1 s |
| Decompress (using parallel gunzip) and load from decompressed CSV files | 121.3 s |
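For example, a gzip-compressed CSV can be loaded directly (the file name is hypothetical):

CREATE TABLE events AS
    SELECT * FROM read_csv('events.csv.gz');   -- decompressed on the fly during the load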
Loading Many Small CSV Files
The CSV reader runs the CSV sniffer on all files. For many small files, this may cause unnecessarily high overhead. A potential optimization is to turn the sniffer off. Assuming that all files have the same CSV dialect and column names/types, get the sniffer options as follows:
.mode line
SELECT Prompt FROM sniff_csv('part-0001.csv');
Prompt = FROM read_csv('file_path.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={'hello': 'BIGINT', 'world': 'VARCHAR'});
Then, you can adjust the read_csv command, e.g., by applying filename expansion (globbing), and run it with the rest of the options detected by the sniffer:
FROM read_csv('part-*.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={'hello': 'BIGINT', 'world': 'VARCHAR'});