CARVIEW |
- Installation
- Documentation
- Getting Started
- Connect
- Data Import
- Overview
- Data Sources
- CSV Files
- JSON Files
- Overview
- Creating JSON
- Loading JSON
- Writing JSON
- JSON Type
- JSON Functions
- Format Settings
- Installing and Loading
- SQL to / from JSON
- Caveats
- Multiple Files
- Parquet Files
- Partitioning
- Appender
- INSERT Statements
- Client APIs
- Overview
- ADBC
- C
- Overview
- Startup
- Configuration
- Query
- Data Chunks
- Vectors
- Values
- Types
- Prepared Statements
- Appender
- Table Functions
- Replacement Scans
- API Reference
- C++
- CLI
- Dart
- Go
- Java (JDBC)
- Julia
- Node.js (Deprecated)
- Node.js (Neo)
- ODBC
- PHP
- Python
- Overview
- Data Ingestion
- Conversion between DuckDB and Python
- DB API
- Relational API
- Function API
- Types API
- Expression API
- Spark API
- API Reference
- Known Python Issues
- R
- Rust
- Swift
- Wasm
- SQL
- Introduction
- Statements
- Overview
- ANALYZE
- ALTER TABLE
- ALTER VIEW
- ATTACH and DETACH
- CALL
- CHECKPOINT
- COMMENT ON
- COPY
- CREATE INDEX
- CREATE MACRO
- CREATE SCHEMA
- CREATE SECRET
- CREATE SEQUENCE
- CREATE TABLE
- CREATE VIEW
- CREATE TYPE
- DELETE
- DESCRIBE
- DROP
- EXPORT and IMPORT DATABASE
- INSERT
- LOAD / INSTALL
- PIVOT
- Profiling
- SELECT
- SET / RESET
- SET VARIABLE
- SUMMARIZE
- Transaction Management
- UNPIVOT
- UPDATE
- USE
- VACUUM
- Query Syntax
- SELECT
- FROM and JOIN
- WHERE
- GROUP BY
- GROUPING SETS
- HAVING
- ORDER BY
- LIMIT and OFFSET
- SAMPLE
- Unnesting
- WITH
- WINDOW
- QUALIFY
- VALUES
- FILTER
- Set Operations
- Prepared Statements
- Data Types
- Overview
- Array
- Bitstring
- Blob
- Boolean
- Date
- Enum
- Interval
- List
- Literal Types
- Map
- NULL Values
- Numeric
- Struct
- Text
- Time
- Timestamp
- Time Zones
- Union
- Typecasting
- Expressions
- Overview
- CASE Expression
- Casting
- Collations
- Comparisons
- IN Operator
- Logical Operators
- Star Expression
- Subqueries
- TRY
- Functions
- Overview
- Aggregate Functions
- Array Functions
- Bitstring Functions
- Blob Functions
- Date Format Functions
- Date Functions
- Date Part Functions
- Enum Functions
- Interval Functions
- Lambda Functions
- List Functions
- Map Functions
- Nested Functions
- Numeric Functions
- Pattern Matching
- Regular Expressions
- Struct Functions
- Text Functions
- Time Functions
- Timestamp Functions
- Timestamp with Time Zone Functions
- Union Functions
- Utility Functions
- Window Functions
- Constraints
- Indexes
- Meta Queries
- DuckDB's SQL Dialect
- Overview
- Indexing
- Friendly SQL
- Keywords and Identifiers
- Order Preservation
- PostgreSQL Compatibility
- SQL Quirks
- Samples
- Configuration
- Extensions
- Overview
- Installing Extensions
- Advanced Installation Methods
- Distributing Extensions
- Versioning of Extensions
- Troubleshooting of Extensions
- Core Extensions
- Overview
- AutoComplete
- Avro
- AWS
- Azure
- Delta
- DuckLake
- Encodings
- Excel
- Full Text Search
- httpfs (HTTP and S3)
- Iceberg
- Overview
- Iceberg REST Catalogs
- Amazon S3 Tables
- Amazon SageMaker Lakehouse (AWS Glue)
- Troubleshooting
- ICU
- inet
- jemalloc
- MySQL
- PostgreSQL
- Spatial
- SQLite
- TPC-DS
- TPC-H
- UI
- VSS
- Guides
- Overview
- Data Viewers
- Database Integration
- File Formats
- Overview
- CSV Import
- CSV Export
- Directly Reading Files
- Excel Import
- Excel Export
- JSON Import
- JSON Export
- Parquet Import
- Parquet Export
- Querying Parquet Files
- File Access with the file: Protocol
- Network and Cloud Storage
- Overview
- HTTP Parquet Import
- S3 Parquet Import
- S3 Parquet Export
- S3 Iceberg Import
- S3 Express One
- GCS Import
- Cloudflare R2 Import
- DuckDB over HTTPS / S3
- Fastly Object Storage Import
- Meta Queries
- Describe Table
- EXPLAIN: Inspect Query Plans
- EXPLAIN ANALYZE: Profile Queries
- List Tables
- Summarize
- DuckDB Environment
- ODBC
- Performance
- Overview
- Environment
- Import
- Schema
- Indexing
- Join Operations
- File Formats
- How to Tune Workloads
- My Workload Is Slow
- Benchmarks
- Working with Huge Databases
- Python
- Installation
- Executing SQL
- Jupyter Notebooks
- marimo Notebooks
- SQL on Pandas
- Import from Pandas
- Export to Pandas
- Import from Numpy
- Export to Numpy
- SQL on Arrow
- Import from Arrow
- Export to Arrow
- Relational API on Pandas
- Multiple Python Threads
- Integration with Ibis
- Integration with Polars
- Using fsspec Filesystems
- SQL Editors
- SQL Features
- Snippets
- Creating Synthetic Data
- Dutch Railway Datasets
- Sharing Macros
- Analyzing a Git Repository
- Importing Duckbox Tables
- Copying an In-Memory Database to a File
- Troubleshooting
- Glossary of Terms
- Browsing Offline
- Operations Manual
- Overview
- DuckDB's Footprint
- Logging
- Securing DuckDB
- Non-Deterministic Behavior
- Limits
- Development
- DuckDB Repositories
- Profiling
- Building DuckDB
- Overview
- Build Configuration
- Building Extensions
- Android
- Linux
- macOS
- Raspberry Pi
- Windows
- Python
- R
- Troubleshooting
- Unofficial and Unsupported Platforms
- Benchmark Suite
- Testing
- Internals
- Why DuckDB
- Code of Conduct
- Release Calendar
- Roadmap
- Sitemap
- Live Demo
Name | Description |
---|---|
ENUM |
Dictionary representing all possible string values of a column |
The enum type represents a dictionary data structure with all possible unique values of a column. For example, a column storing the days of the week can be an enum holding all possible days. Enums are particularly interesting for string columns with low cardinality (i.e., fewer distinct values). This is because the column only stores a numerical reference to the string in the enum dictionary, resulting in immense savings in disk storage and faster query performance.
Enum Definition
Enum types are created from either a hardcoded set of values or from a select statement that returns a single column of VARCHAR
s. The set of values in the select statement will be deduplicated, but if the enum is created from a hardcoded set there may not be any duplicates.
Create enum using hardcoded values:
CREATE TYPE enum_name AS ENUM (value_1, value_2, ...);
Create enum using a SELECT
statement that returns a single column of VARCHAR
s:
CREATE TYPE enum_name AS ENUM (select_expression);
Enums can also be created on the fly during casting:
SELECT 'some_string'::ENUM (value_1, value_2, ...);
Examples
Creates new user defined type mood
as an enum:
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
Create an enum from a select statement. First create an example table of values:
CREATE TABLE my_inputs AS
FROM (VALUES ('duck'), ('duck'), ('goose')) t(my_varchar);
Create an anonymous enum value on the fly:
SELECT 'happy'::ENUM ('sad', 'ok', 'happy');
This statement will fail since enums cannot hold NULL
values:
CREATE TYPE breed AS ENUM ('maltese', NULL);
This statement will fail since enum values must be unique:
CREATE TYPE breed AS ENUM ('maltese', 'maltese');
Create an enum using the unique string values in the my_varchar
column:
CREATE TYPE birds AS ENUM (SELECT my_varchar FROM my_inputs);
Show the available values in the birds
enum using the enum_range
function:
SELECT enum_range(NULL::birds) AS my_enum_range;
my_enum_range |
---|
[duck, goose] |
Enum Usage
After an enum has been created, it can be used anywhere a standard built-in type is used. For example, we can create a table with a column that references the enum.
Creates a table person
, with attributes name
(string type) and current_mood
(mood type):
CREATE TABLE person (
name TEXT,
current_mood mood
);
Inserts tuples in the person
table:
INSERT INTO person
VALUES ('Pedro', 'happy'), ('Mark', NULL), ('Pagliacci', 'sad'), ('Mr. Mackey', 'ok');
The following query will fail since the mood type does not have quackity-quack
value.
INSERT INTO person
VALUES ('Hannes', 'quackity-quack');
The string sad
is cast to the type mood
, returning a numerical reference value.
This makes the comparison a numerical comparison instead of a string comparison.
SELECT *
FROM person
WHERE current_mood = 'sad';
name | current_mood |
---|---|
Pagliacci | sad |
If you are importing data from a file, you can create an enum for a VARCHAR
column before importing.
Given this, the following subquery selects automatically selects only distinct values:
CREATE TYPE mood AS ENUM (SELECT mood FROM 'path/to/file.csv');
Then you can create a table with the enum type and import using any data import statement:
CREATE TABLE person (name TEXT, current_mood mood);
COPY person FROM 'path/to/file.csv';
Enums vs. Strings
DuckDB enums are automatically cast to VARCHAR
types whenever necessary. This characteristic allows for enum columns to be used in any VARCHAR
function. In addition, it also allows for comparisons between different enum columns, or an enum and a VARCHAR
column.
For example:
Regexp_matches is a function that takes a VARCHAR, hence current_mood is cast to VARCHAR:
SELECT regexp_matches(current_mood, '.*a.*') AS contains_a
FROM person;
contains_a |
---|
true |
NULL |
true |
false |
Create a new mood and table:
CREATE TYPE new_mood AS ENUM ('happy', 'anxious');
CREATE TABLE person_2 (
name text,
current_mood mood,
future_mood new_mood,
past_mood VARCHAR
);
Since the current_mood
and future_mood
columns are constructed on different enum types, DuckDB will cast both enums to strings and perform a string comparison:
SELECT *
FROM person_2
WHERE current_mood = future_mood;
When comparing the past_mood
column (string), DuckDB will cast the current_mood
enum to VARCHAR
and perform a string comparison:
SELECT *
FROM person_2
WHERE current_mood = past_mood;
Enum Removal
Enum types are stored in the catalog, and a catalog dependency is added to each table that uses them. It is possible to drop an enum from the catalog using the following command:
DROP TYPE enum_name;
Currently, it is possible to drop enums that are used in tables without affecting the tables.
Warning This behavior of the enum removal feature is subject to change. In future releases, it is expected that any dependent columns must be removed before dropping the enum, or the enum must be dropped with the additional
CASCADE
parameter.
Comparison of Enums
Enum values are compared according to their order in the enum's definition. For example:
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
SELECT 'sad'::mood < 'ok'::mood AS comp;
comp |
---|
true |
SELECT unnest(['ok'::mood, 'happy'::mood, 'sad'::mood]) AS m
ORDER BY m;
m |
---|
sad |
ok |
happy |
Functions
See Enum Functions.