Carview!

HOME
ABOUT
- RESULTS
- differences
- BENEFITS
- HISTORY
- TEAM
- LOCATION
- FACILITIES
- BANKING
- MEMBERSHIPS
- APPROVALS
- LICENCES
- SUPPLIERS
- SPONSORSHIPS
- MEDIA
- PRIVACY
AUCTIONS
SHIPPING
FEES
- TS REWARDS
TOOLS
guides
FAQ
CONTACT
- CONNECT

VEHICLES
BRAND
- JAPANESE CARS
  - DAIHATSU
  - EUNOS
  - FORD
  - HONDA
  - ISUZU
  - LEXUS
  - MAZDA
  - MITSUBISHI
  - MITSUOKA
  - NISSAN
  - SUBARU
  - SUZUKI
  - TOYOTA
- GERMAN CARS
- AMERICAN CARS
- BRITISH CARS
- ITALIAN CARS
- FRENCH CARS
- SWEDISH CARS
- KOREAN CARS
TYPE
- mobility
- VENDING
- instruction
- TAXIS
- AMBULANCES
- FIRE ENGINES
- HEARSES
- LIMOUSINES
- COMMERCIAL
CLASS
FUEL
TRUCKS
minitrucks
- DAIHATSU
- HONDA
- MAZDA
- MITSUBISHI
- NISSAN
- SUBARU
- SUZUKI
- DUMP
- CRANE
- CAMPER
- REFRIGERATED
- 4WD
- NEW
BUSES
MOTORHOMES
- YAHOO!
- RAKUTEN
- DEALER

PARTS
- FREE REPORT
- PARTS CONTAINERS
- PARTS SYSTEMS
- PARTS PROTECTION
- BODY SHELLS
- DISMANTLING
- ONLINE PARTS
- NEW PARTS
- INTERIOR PARTS
- EXTERIOR PARTS
  - BONNETS
  - BUMPERS
  - GRILLES
  - FENDERS
  - DOORS
  - TRUNKS
  - SPOILERS
  - LIGHTS
  - EMBLEMS
  - CAMERAS
- ENGINES
- TRANSMISSIONS
- WHEELS & TYRES
  - WHEELS
  - TYRES
CUTS
PERFORMANCE PARTS
TRUCK PARTS
MOTORBIKE PARTS
- MOTORBIKE ENGINES
- MOTORBIKE ACCESSORIES

MOTORBIKES
MARINE
FORKLIFTS
MACHINERY
AGRICULTURAL
OTHER
COUNTRY
- AUSTRALIA
- CANADA
- KENYA
- MYANMAR
- NEW ZEALAND
- PAKISTAN
- TANZANIA
- UNITED STATES

CARVIEW

MOTORHOMES

Select Language

HTTP/2 200 date: Tue, 14 Oct 2025 19:18:38 GMT content-type: text/html; charset=utf-8 nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800} server: cloudflare last-modified: Tue, 14 Oct 2025 15:46:39 GMT access-control-allow-origin: * expires: Tue, 14 Oct 2025 19:28:37 GMT cache-control: max-age=600 report-to: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=d0rU4pkZkQdg7IvIJZmdRJmRhTVmKlYjbGqMBITeYlTNuVYludCIjh%2FErnm69Hne%2F7FdCQDGvklnFh00z7lhuqa2T0E5RiaHrg%3D%3D"}]} x-proxy-cache: MISS x-github-request-id: B6C4:3F1E67:E0798:107EC7:68EEA20D age: 0 via: 1.1 varnish x-served-by: cache-bom-vanm7210029-BOM x-cache: MISS x-cache-hits: 0 x-timer: S1760469518.796792,VS0,VE299 vary: Accept-Encoding x-fastly-request-id: a42b9472aa71637968e2e006b56cc5a943f6b33d cf-cache-status: DYNAMIC content-encoding: gzip cf-ray: 98e96c75fdc0c7cf-BLR alt-svc: h3=":443"; ma=86400 Integration with Ibis – DuckDB

⌘+k ctrl+k

1.4 (stable)

1.5-dev
1.4
1.3
1.2
1.1
1.0
0.10
0.9
0.8

Search Shortcut cmd + k | ctrl + k

Installation
Documentation

Getting Started
Connect

Overview
Concurrency

Data Import

Overview
Data Sources
CSV Files

Overview
Auto Detection
Reading Faulty CSV Files
Tips

JSON Files

Overview
Creating JSON
Loading JSON
Writing JSON
JSON Type
JSON Functions
Format Settings
Installing and Loading
SQL to / from JSON
Caveats

Multiple Files

Overview
Combining Schemas

Parquet Files

Overview
Metadata
Encryption
Tips

Partitioning

Hive Partitioning
Partitioned Writes

Appender
INSERT Statements

Client APIs

Overview
Tertiary Clients
ADBC
C

Overview
Startup
Configuration
Query
Data Chunks
Vectors
Values
Types
Prepared Statements
Appender
Table Functions
Replacement Scans
API Reference

Overview
Arguments
Dot Commands
Output Formats
Editing
Safe Mode
Autocomplete
Syntax Highlighting
Known Issues

Dart
Go
Java (JDBC)
Julia
Node.js (Deprecated)

Overview
API Reference

Node.js (Neo)

Overview

ODBC

Overview
Linux Setup
Windows Setup
macOS Setup
Configuration

PHP
Python

Overview
Data Ingestion
Conversion between DuckDB and Python
DB API
Relational API
Function API
Types API
Expression API
Spark API
API Reference
Known Python Issues

R
Rust
Swift
Wasm

Overview
Deploying DuckDB-Wasm
Instantiation
Data Ingestion
Query
Extensions

Introduction
Statements

Overview
ANALYZE
ALTER DATABASE
ALTER TABLE
ALTER VIEW
ATTACH and DETACH
CALL
CHECKPOINT
COMMENT ON
COPY
CREATE INDEX
CREATE MACRO
CREATE SCHEMA
CREATE SECRET
CREATE SEQUENCE
CREATE TABLE
CREATE VIEW
CREATE TYPE
DELETE
DESCRIBE
DROP
EXPORT and IMPORT DATABASE
INSERT
LOAD / INSTALL
MERGE INTO
PIVOT
Profiling
SELECT
SET / RESET
SET VARIABLE
SUMMARIZE
Transaction Management
UNPIVOT
UPDATE
USE
VACUUM

Query Syntax

SELECT
FROM and JOIN
WHERE
GROUP BY
GROUPING SETS
HAVING
ORDER BY
LIMIT and OFFSET
SAMPLE
Unnesting
WITH
WINDOW
QUALIFY
VALUES
FILTER
Set Operations
Prepared Statements

Data Types

Overview
Array
Bitstring
Blob
Boolean
Date
Enum
Interval
List
Literal Types
Map
NULL Values
Numeric
Struct
Text
Time
Timestamp
Time Zones
Union
Typecasting

Expressions

Overview
CASE Expression
Casting
Collations
Comparisons
IN Operator
Logical Operators
Star Expression
Subqueries
TRY

Functions

Overview
Aggregate Functions
Array Functions
Bitstring Functions
Blob Functions
Date Format Functions
Date Functions
Date Part Functions
Enum Functions
Interval Functions
Lambda Functions
List Functions
Map Functions
Nested Functions
Numeric Functions
Pattern Matching
Regular Expressions
Struct Functions
Text Functions
Time Functions
Timestamp Functions
Timestamp with Time Zone Functions
Union Functions
Utility Functions
Window Functions

Constraints
Indexes
Meta Queries

Information Schema
Metadata Functions

DuckDB's SQL Dialect

Overview
Indexing
Friendly SQL
Keywords and Identifiers
Order Preservation
PostgreSQL Compatibility
SQL Quirks

Samples

Configuration

Overview
Pragmas
Secrets Manager

Extensions

Overview
Installing Extensions
Advanced Installation Methods
Distributing Extensions
Versioning of Extensions
Troubleshooting of Extensions

Core Extensions

Overview
AutoComplete
Avro
AWS
Azure
Delta
DuckLake
Encodings
Excel
Full Text Search
httpfs (HTTP and S3)

Overview
HTTP(S) Support
Hugging Face
S3 API Support
Legacy Authentication Scheme for S3 API

Iceberg

Overview
Iceberg REST Catalogs
Amazon S3 Tables
Amazon SageMaker Lakehouse (AWS Glue)
Troubleshooting

ICU
inet
jemalloc
MySQL
PostgreSQL
Spatial

Overview
Function Reference
R-Tree Indexes
GDAL Integration

SQLite
TPC-DS
TPC-H
UI
VSS

Guides

Overview
Data Viewers

Tableau
CLI Charting with YouPlot

Database Integration

Overview
MySQL Import
PostgreSQL Import
SQLite Import

File Formats

Overview
CSV Import
CSV Export
Directly Reading Files
Excel Import
Excel Export
JSON Import
JSON Export
Parquet Import
Parquet Export
Querying Parquet Files
File Access with the file: Protocol

Network and Cloud Storage

Overview
HTTP Parquet Import
S3 Parquet Import
S3 Parquet Export
S3 Iceberg Import
S3 Express One
GCS Import
Cloudflare R2 Import
DuckDB over HTTPS / S3
Fastly Object Storage Import

Meta Queries

Describe Table
EXPLAIN: Inspect Query Plans
EXPLAIN ANALYZE: Profile Queries
List Tables
Summarize
DuckDB Environment

ODBC

ODBC Guide

Performance

Overview
Environment
Import
Schema
Indexing
Join Operations
File Formats
How to Tune Workloads
My Workload Is Slow
Benchmarks
Working with Huge Databases

Python

Installation
Executing SQL
Jupyter Notebooks
marimo Notebooks
SQL on Pandas
Import from Pandas
Export to Pandas
Import from Numpy
Export to Numpy
SQL on Arrow
Import from Arrow
Export to Arrow
Relational API on Pandas
Multiple Python Threads
Integration with Ibis
Integration with Polars
Using fsspec Filesystems

SQL Editors

DBeaver SQL IDE

SQL Features

AsOf Join
Full-Text Search
query and query_table Functions
Timestamp Issues

Snippets

Creating Synthetic Data
Dutch Railway Datasets
Sharing Macros
Analyzing a Git Repository
Importing Duckbox Tables
Copying an In-Memory Database to a File

Troubleshooting

Crashes
Out of Memory Errors

Glossary of Terms
Browsing Offline

Operations Manual

Overview
DuckDB's Footprint

Files Created by DuckDB
Gitignore for DuckDB
Reclaiming Space

Logging

Overview

Securing DuckDB

Overview
Embedding DuckDB
Securing Extensions

Non-Deterministic Behavior
Limits
DuckDB Docker Container

Development

DuckDB Repositories
Profiling
Building DuckDB

Overview
Build Configuration
Building Extensions
Android
Linux
macOS
Raspberry Pi
Windows
Python
R
Troubleshooting
Unofficial and Unsupported Platforms

Benchmark Suite
Testing

Overview
sqllogictest Introduction
Writing Tests
Debugging
Result Verification
Persistent Testing
Loops
Multiple Connections
Catch

Internals

Overview
Storage Versions and Format
Execution Format
Pivot

Sitemap
Live Demo

Documentation / Guides / Python

Integration with Ibis

Ibis is a Python dataframe library that supports 20+ backends, with DuckDB as the default. Ibis with DuckDB provides a Pythonic interface for SQL with great performance.

Installation

You can pip install Ibis with the DuckDB backend:

pip install 'ibis-framework[duckdb,examples]' # examples is only required to access the sample data Ibis provides

or use conda:

conda install ibis-framework

or use mamba:

mamba install ibis-framework

Create a Database File

Ibis can work with several file types, but at its core, it connects to existing databases and interacts with the data there. You can get started with your own DuckDB databases or create a new one with example data.

import ibis
con = ibis.connect("duckdb://penguins.ddb")
con.create_table(
    "penguins", ibis.examples.penguins.fetch().to_pyarrow(), overwrite = True
)

# Output:
DatabaseTable: penguins
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm int64
  body_mass_g       int64
  sex               string
  year              int64

You can now see the example dataset copied over to the database:

# reconnect to the persisted database (dropping temp tables)
con = ibis.connect("duckdb://penguins.ddb")
con.list_tables()

# Output:
['penguins']

There's one table, called penguins. We can ask Ibis to give us an object that we can interact with.

penguins = con.table("penguins")
penguins

# Output:
DatabaseTable: penguins
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm int64
  body_mass_g       int64
  sex               string
  year              int64

Ibis is lazily evaluated, so instead of seeing the data, we see the schema of the table. To peek at the data, we can call head and then to_pandas to get the first few rows of the table as a pandas DataFrame.

penguins.head().to_pandas()

  species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007
Adelie  Torgersen             NaN            NaN                NaN          NaN    None  2007
Adelie  Torgersen            36.7           19.3              193.0       3450.0  female  2007

to_pandas takes the existing lazy table expression and evaluates it. If we leave it off, you'll see the Ibis representation of the table expression that to_pandas will evaluate (when you're ready!).

penguins.head()

# Output:
r0 := DatabaseTable: penguins
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm int64
  body_mass_g       int64
  sex               string
  year              int64
Limit[r0, n=5]

Ibis returns results as a pandas DataFrame using to_pandas, but isn't using pandas to perform any of the computation. The query is executed by DuckDB. Only when to_pandas is called does Ibis then pull back the results and convert them into a DataFrame.

Interactive Mode

For the rest of this intro, we'll turn on interactive mode, which partially executes queries to give users a preview of the results. There is a small difference in the way the output is formatted, but otherwise this is the same as calling to_pandas on the table expression with a limit of 10 result rows returned.

ibis.options.interactive = True
penguins.head()

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │            nan │           nan │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Common Operations

Ibis has a collection of useful table methods to manipulate and query the data in a table.

filter

filter allows you to select rows based on a condition or set of conditions.

We can filter so we only have penguins of the species Adelie:

penguins.filter(penguins.species == "Gentoo")

┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Gentoo  │ Biscoe │           46.1 │          13.2 │               211 │        4500 │ female │  2007 │
│ Gentoo  │ Biscoe │           50.0 │          16.3 │               230 │        5700 │ male   │  2007 │
│ Gentoo  │ Biscoe │           48.7 │          14.1 │               210 │        4450 │ female │  2007 │
│ Gentoo  │ Biscoe │           50.0 │          15.2 │               218 │        5700 │ male   │  2007 │
│ Gentoo  │ Biscoe │           47.6 │          14.5 │               215 │        5400 │ male   │  2007 │
│ Gentoo  │ Biscoe │           46.5 │          13.5 │               210 │        4550 │ female │  2007 │
│ Gentoo  │ Biscoe │           45.4 │          14.6 │               211 │        4800 │ female │  2007 │
│ Gentoo  │ Biscoe │           46.7 │          15.3 │               219 │        5200 │ male   │  2007 │
│ Gentoo  │ Biscoe │           43.3 │          13.4 │               209 │        4400 │ female │  2007 │
│ Gentoo  │ Biscoe │           46.8 │          15.4 │               215 │        5150 │ male   │  2007 │
│ …       │ …      │              … │             … │                 … │           … │ …      │     … │
└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Or filter for Gentoo penguins that have a body mass larger than 6 kg.

penguins.filter((penguins.species == "Gentoo") & (penguins.body_mass_g > 6000))

┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Gentoo  │ Biscoe │           49.2 │          15.2 │               221 │        6300 │ male   │  2007 │
│ Gentoo  │ Biscoe │           59.6 │          17.0 │               230 │        6050 │ male   │  2007 │
└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

You can use any Boolean comparison in a filter (although if you try to do something like use < on a string, Ibis will yell at you).

select

Your data analysis might not require all the columns present in a given table. select lets you pick out only those columns that you want to work with.

To select a column you can use the name of the column as a string:

penguins.select("species", "island", "year").limit(3)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ int64 │
├─────────┼───────────┼───────┤
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ …       │ …         │     … │
└─────────┴───────────┴───────┘

Or you can use column objects directly (this can be convenient when paired with tab-completion):

penguins.select(penguins.species, penguins.island, penguins.year).limit(3)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ int64 │
├─────────┼───────────┼───────┤
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ …       │ …         │     … │
└─────────┴───────────┴───────┘

Or you can mix-and-match:

penguins.select("species", "island", penguins.year).limit(3)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ int64 │
├─────────┼───────────┼───────┤
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ Adelie  │ Torgersen │  2007 │
│ …       │ …         │     … │
└─────────┴───────────┴───────┘

mutate

mutate lets you add new columns to your table, derived from the values of existing columns.

penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ bill_length_cm ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │ float64        │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │           3.91 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │           3.95 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │           4.03 │
│ Adelie  │ Torgersen │            nan │           nan │              NULL │        NULL │ NULL   │  2007 │            nan │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │           3.67 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │           3.93 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │           3.89 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │           3.92 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │           3.41 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │           4.20 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │              … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘

Notice that the table is a little too wide to display all the columns now (depending on your screen-size). bill_length is now present in millimeters and centimeters. Use a select to trim down the number of columns we're looking at.

penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10).select(
    "species",
    "island",
    "bill_depth_mm",
    "flipper_length_mm",
    "body_mass_g",
    "sex",
    "year",
    "bill_length_cm",
)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ bill_length_cm ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64       │ int64             │ int64       │ string │ int64 │ float64        │
├─────────┼───────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤
│ Adelie  │ Torgersen │          18.7 │               181 │        3750 │ male   │  2007 │           3.91 │
│ Adelie  │ Torgersen │          17.4 │               186 │        3800 │ female │  2007 │           3.95 │
│ Adelie  │ Torgersen │          18.0 │               195 │        3250 │ female │  2007 │           4.03 │
│ Adelie  │ Torgersen │           nan │              NULL │        NULL │ NULL   │  2007 │            nan │
│ Adelie  │ Torgersen │          19.3 │               193 │        3450 │ female │  2007 │           3.67 │
│ Adelie  │ Torgersen │          20.6 │               190 │        3650 │ male   │  2007 │           3.93 │
│ Adelie  │ Torgersen │          17.8 │               181 │        3625 │ female │  2007 │           3.89 │
│ Adelie  │ Torgersen │          19.6 │               195 │        4675 │ male   │  2007 │           3.92 │
│ Adelie  │ Torgersen │          18.1 │               193 │        3475 │ NULL   │  2007 │           3.41 │
│ Adelie  │ Torgersen │          20.2 │               190 │        4250 │ NULL   │  2007 │           4.20 │
│ …       │ …         │             … │                 … │           … │ …      │     … │              … │
└─────────┴───────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘

selectors

Typing out all of the column names except one is a little annoying. Instead of doing that again, we can use a selector to quickly select or deselect groups of columns.

import ibis.selectors as s
penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10).select(
    ~s.matches("bill_length_mm")
    # match every column except `bill_length_mm`
)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃ bill_length_cm ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ string  │ string    │ float64       │ int64             │ int64       │ string │ int64 │ float64        │
├─────────┼───────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤
│ Adelie  │ Torgersen │          18.7 │               181 │        3750 │ male   │  2007 │           3.91 │
│ Adelie  │ Torgersen │          17.4 │               186 │        3800 │ female │  2007 │           3.95 │
│ Adelie  │ Torgersen │          18.0 │               195 │        3250 │ female │  2007 │           4.03 │
│ Adelie  │ Torgersen │           nan │              NULL │        NULL │ NULL   │  2007 │            nan │
│ Adelie  │ Torgersen │          19.3 │               193 │        3450 │ female │  2007 │           3.67 │
│ Adelie  │ Torgersen │          20.6 │               190 │        3650 │ male   │  2007 │           3.93 │
│ Adelie  │ Torgersen │          17.8 │               181 │        3625 │ female │  2007 │           3.89 │
│ Adelie  │ Torgersen │          19.6 │               195 │        4675 │ male   │  2007 │           3.92 │
│ Adelie  │ Torgersen │          18.1 │               193 │        3475 │ NULL   │  2007 │           3.41 │
│ Adelie  │ Torgersen │          20.2 │               190 │        4250 │ NULL   │  2007 │           4.20 │
│ …       │ …         │             … │                 … │           … │ …      │     … │              … │
└─────────┴───────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘

You can also use a selector alongside a column name.

penguins.select("island", s.numeric())

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━┓
┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ year  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━┩
│ string    │ float64        │ float64       │ int64             │ int64       │ int64 │
├───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼───────┤
│ Torgersen │           39.1 │          18.7 │               181 │        3750 │  2007 │
│ Torgersen │           39.5 │          17.4 │               186 │        3800 │  2007 │
│ Torgersen │           40.3 │          18.0 │               195 │        3250 │  2007 │
│ Torgersen │            nan │           nan │              NULL │        NULL │  2007 │
│ Torgersen │           36.7 │          19.3 │               193 │        3450 │  2007 │
│ Torgersen │           39.3 │          20.6 │               190 │        3650 │  2007 │
│ Torgersen │           38.9 │          17.8 │               181 │        3625 │  2007 │
│ Torgersen │           39.2 │          19.6 │               195 │        4675 │  2007 │
│ Torgersen │           34.1 │          18.1 │               193 │        3475 │  2007 │
│ Torgersen │           42.0 │          20.2 │               190 │        4250 │  2007 │
│ …         │              … │             … │                 … │           … │     … │
└───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴───────┘

You can read more about selectors in the docs!

`order_by`

order_by arranges the values of one or more columns in ascending or descending order.

By default, ibis sorts in ascending order:

penguins.order_by(penguins.flipper_length_mm).select(
    "species", "island", "flipper_length_mm"
)

┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species   ┃ island    ┃ flipper_length_mm ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string    │ string    │ int64             │
├───────────┼───────────┼───────────────────┤
│ Adelie    │ Biscoe    │               172 │
│ Adelie    │ Biscoe    │               174 │
│ Adelie    │ Torgersen │               176 │
│ Adelie    │ Dream     │               178 │
│ Adelie    │ Dream     │               178 │
│ Adelie    │ Dream     │               178 │
│ Chinstrap │ Dream     │               178 │
│ Adelie    │ Dream     │               179 │
│ Adelie    │ Torgersen │               180 │
│ Adelie    │ Biscoe    │               180 │
│ …         │ …         │                 … │
└───────────┴───────────┴───────────────────┘

You can sort in descending order using the desc method of a column:

penguins.order_by(penguins.flipper_length_mm.desc()).select(
    "species", "island", "flipper_length_mm"
)

┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island ┃ flipper_length_mm ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string  │ string │ int64             │
├─────────┼────────┼───────────────────┤
│ Gentoo  │ Biscoe │               231 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               229 │
│ Gentoo  │ Biscoe │               229 │
│ …       │ …      │                 … │
└─────────┴────────┴───────────────────┘

Or you can use ibis.desc

penguins.order_by(ibis.desc("flipper_length_mm")).select(
    "species", "island", "flipper_length_mm"
)

┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island ┃ flipper_length_mm ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ string  │ string │ int64             │
├─────────┼────────┼───────────────────┤
│ Gentoo  │ Biscoe │               231 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               230 │
│ Gentoo  │ Biscoe │               229 │
│ Gentoo  │ Biscoe │               229 │
│ …       │ …      │                 … │
└─────────┴────────┴───────────────────┘

aggregate

Ibis has several aggregate functions available to help summarize data.

mean, max, min, count, sum (the list goes on).

To aggregate an entire column, call the corresponding method on that column.

penguins.flipper_length_mm.mean()

# Output:
200.91520467836258

You can compute multiple aggregates at once using the aggregate method:

penguins.aggregate([penguins.flipper_length_mm.mean(), penguins.bill_depth_mm.max()])

┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Mean(flipper_length_mm) ┃ Max(bill_depth_mm) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ float64                 │ float64            │
├─────────────────────────┼────────────────────┤
│              200.915205 │               21.5 │
└─────────────────────────┴────────────────────┘

But aggregate really shines when it's paired with group_by.

`group_by`

group_by creates groupings of rows that have the same value for one or more columns.

But it doesn't do much on its own – you can pair it with aggregate to get a result.

penguins.group_by("species").aggregate()

┏━━━━━━━━━━━┓
┃ species   ┃
┡━━━━━━━━━━━┩
│ string    │
├───────────┤
│ Adelie    │
│ Gentoo    │
│ Chinstrap │
└───────────┘

We grouped by the species column and handed it an “empty” aggregate command. The result of that is a column of the unique values in the species column.

If we add a second column to the group_by, we'll get each unique pairing of the values in those columns.

penguins.group_by(["species", "island"]).aggregate()

┏━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ species   ┃ island    ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━┩
│ string    │ string    │
├───────────┼───────────┤
│ Adelie    │ Torgersen │
│ Adelie    │ Biscoe    │
│ Adelie    │ Dream     │
│ Gentoo    │ Biscoe    │
│ Chinstrap │ Dream     │
└───────────┴───────────┘

Now, if we add an aggregation function to that, we start to really open things up.

penguins.group_by(["species", "island"]).aggregate(penguins.bill_length_mm.mean())

┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ species   ┃ island    ┃ Mean(bill_length_mm) ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ string    │ string    │ float64              │
├───────────┼───────────┼──────────────────────┤
│ Adelie    │ Torgersen │            38.950980 │
│ Adelie    │ Biscoe    │            38.975000 │
│ Adelie    │ Dream     │            38.501786 │
│ Gentoo    │ Biscoe    │            47.504878 │
│ Chinstrap │ Dream     │            48.833824 │
└───────────┴───────────┴──────────────────────┘

By adding that mean to the aggregate, we now have a concise way to calculate aggregates over each of the distinct groups in the group_by. And we can calculate as many aggregates as we need.

penguins.group_by(["species", "island"]).aggregate(
    [penguins.bill_length_mm.mean(), penguins.flipper_length_mm.max()]
)

┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ species   ┃ island    ┃ Mean(bill_length_mm) ┃ Max(flipper_length_mm) ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string    │ string    │ float64              │ int64                  │
├───────────┼───────────┼──────────────────────┼────────────────────────┤
│ Adelie    │ Torgersen │            38.950980 │                    210 │
│ Adelie    │ Biscoe    │            38.975000 │                    203 │
│ Adelie    │ Dream     │            38.501786 │                    208 │
│ Gentoo    │ Biscoe    │            47.504878 │                    231 │
│ Chinstrap │ Dream     │            48.833824 │                    212 │
└───────────┴───────────┴──────────────────────┴────────────────────────┘

If we need more specific groups, we can add to the group_by.

penguins.group_by(["species", "island", "sex"]).aggregate(
    [penguins.bill_length_mm.mean(), penguins.flipper_length_mm.max()]
)

┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ species ┃ island    ┃ sex    ┃ Mean(bill_length_mm) ┃ Max(flipper_length_mm) ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string  │ string    │ string │ float64              │ int64                  │
├─────────┼───────────┼────────┼──────────────────────┼────────────────────────┤
│ Adelie  │ Torgersen │ male   │            40.586957 │                    210 │
│ Adelie  │ Torgersen │ female │            37.554167 │                    196 │
│ Adelie  │ Torgersen │ NULL   │            37.925000 │                    193 │
│ Adelie  │ Biscoe    │ female │            37.359091 │                    199 │
│ Adelie  │ Biscoe    │ male   │            40.590909 │                    203 │
│ Adelie  │ Dream     │ female │            36.911111 │                    202 │
│ Adelie  │ Dream     │ male   │            40.071429 │                    208 │
│ Adelie  │ Dream     │ NULL   │            37.500000 │                    179 │
│ Gentoo  │ Biscoe    │ female │            45.563793 │                    222 │
│ Gentoo  │ Biscoe    │ male   │            49.473770 │                    231 │
│ …       │ …         │ …      │                    … │                      … │
└─────────┴───────────┴────────┴──────────────────────┴────────────────────────┘

Chaining It All Together

We've already chained some Ibis calls together. We used mutate to create a new column and then select to only view a subset of the new table. We were just chaining group_by with aggregate.

There's nothing stopping us from putting all of these concepts together to ask questions of the data.

How about:

What was the largest female penguin (by body mass) on each island in the year 2008?

penguins.filter((penguins.sex == "female") & (penguins.year == 2008)).group_by(
    ["island"]
).aggregate(penguins.body_mass_g.max())

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ island    ┃ Max(body_mass_g) ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ string    │ int64            │
├───────────┼──────────────────┤
│ Biscoe    │             5200 │
│ Torgersen │             3800 │
│ Dream     │             3900 │
└───────────┴──────────────────┘

What about the largest male penguin (by body mass) on each island for each year of data collection?

penguins.filter(penguins.sex == "male").group_by(["island", "year"]).aggregate(
    penguins.body_mass_g.max().name("max_body_mass")
).order_by(["year", "max_body_mass"])

┏━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ island    ┃ year  ┃ max_body_mass ┃
┡━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━┩
│ string    │ int64 │ int64         │
├───────────┼───────┼───────────────┤
│ Dream     │  2007 │          4650 │
│ Torgersen │  2007 │          4675 │
│ Biscoe    │  2007 │          6300 │
│ Torgersen │  2008 │          4700 │
│ Dream     │  2008 │          4800 │
│ Biscoe    │  2008 │          6000 │
│ Torgersen │  2009 │          4300 │
│ Dream     │  2009 │          4475 │
│ Biscoe    │  2009 │          6000 │
└───────────┴───────┴───────────────┘

Learn More

That's all for this quick-start guide. If you want to learn more, check out the Ibis documentation.

About this page

Report content issue
See this page as Markdown
Edit this page on GitHub

Code of Conduct Trademark Use

About this page

In this article