Design ClickHouse tables with correct engine selection, ORDER BY keys, partitioning, and codec choices for analytical workloads.
Assumes a working `@clickhouse/client` connection (see clickhouse-install-auth).

| Engine | Best For | Dedup? | Example |
|---|---|---|---|
| MergeTree | General analytics, append-only logs | No | Clickstream, IoT |
| ReplacingMergeTree | Mutable rows (upserts) | Yes (on merge) | User profiles, state |
| SummingMergeTree | Pre-aggregated counters | Sums numerics | Page view counts |
| AggregatingMergeTree | Materialized view targets | Merges states | Dashboards |
| CollapsingMergeTree | Stateful row updates | Collapses +1/-1 pairs | Shopping carts |
ClickHouse Cloud uses SharedMergeTree, a drop-in replacement for MergeTree; no DDL changes are needed.
The ORDER BY clause is the single most important schema decision. It defines the physical sort order on disk, the sparse primary index, and how well adjacent values compress.

Rules of thumb:

- Put low-cardinality filter columns first (event_type, status)
- Then medium-cardinality columns you filter by (user_id, tenant_id)
- Put the timestamp last (created_at)

```sql
-- Good: filter by tenant, then by time ranges
ORDER BY (tenant_id, event_type, created_at)

-- Bad: UUID first means every query scans the full index
ORDER BY (event_id, created_at) -- event_id is a random UUID
```
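As a quick sanity check on the rule above, here is a hypothetical lint helper (not part of any ClickHouse tooling; the function name and threshold are made up for illustration) that flags an ORDER BY whose first column is near-unique relative to the table, like the random UUID in the bad example:

```typescript
// Hypothetical check: an ORDER BY whose first column is distinct in most
// rows (e.g. a random UUID) makes the sparse primary index nearly useless.
function orderByStartsTooWide(
  orderBy: string[],
  distinctCounts: Record<string, number>,
  totalRows: number,
  threshold = 0.5, // flag if the first key column is distinct in >50% of rows
): boolean {
  const first = orderBy[0];
  return (distinctCounts[first] ?? 0) / totalRows > threshold;
}

const counts = { event_id: 1_000_000, tenant_id: 200, event_type: 30 };

orderByStartsTooWide(['event_id', 'created_at'], counts, 1_000_000); // true: bad key
orderByStartsTooWide(['tenant_id', 'event_type'], counts, 1_000_000); // false: fine
```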
```sql
CREATE TABLE analytics.events (
    event_id UUID DEFAULT generateUUIDv4(),
    tenant_id UInt32,
    event_type LowCardinality(String),
    user_id UInt64,
    session_id String,
    properties String CODEC(ZSTD(3)), -- JSON blob, compresses well
    url String CODEC(ZSTD(1)),
    ip_address IPv4,
    country LowCardinality(FixedString(2)),
    created_at DateTime64(3) DEFAULT now64(3)
)
ENGINE = MergeTree()
ORDER BY (tenant_id, event_type, toDate(created_at), user_id)
PARTITION BY toYYYYMM(created_at)
TTL created_at + INTERVAL 1 YEAR
SETTINGS index_granularity = 8192;
```
```sql
CREATE TABLE analytics.users (
    user_id UInt64,
    email String,
    plan LowCardinality(String),
    mrr_cents UInt32,
    properties String CODEC(ZSTD(3)),
    updated_at DateTime DEFAULT now()
)
ENGINE = ReplacingMergeTree(updated_at) -- keeps latest row per ORDER BY key
ORDER BY user_id;

-- Query with FINAL to get deduplicated results
SELECT * FROM analytics.users FINAL WHERE user_id = 42;
```
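ReplacingMergeTree deduplicates only when parts merge, which is why FINAL is needed for correct reads. The merge semantics can be sketched in a few lines (a logical model only, with made-up row data; the real engine works on sorted parts on disk):

```typescript
// Model of ReplacingMergeTree(updated_at): within one ORDER BY key
// (user_id), only the row with the highest version column survives.
interface UserRow {
  user_id: number;
  plan: string;
  updated_at: number; // epoch seconds, standing in for DateTime
}

function mergeReplacing(rows: UserRow[]): UserRow[] {
  const latest = new Map<number, UserRow>();
  for (const row of rows) {
    const seen = latest.get(row.user_id);
    // Later version wins; ties keep the most recently inserted row.
    if (!seen || row.updated_at >= seen.updated_at) latest.set(row.user_id, row);
  }
  return [...latest.values()];
}

const merged = mergeReplacing([
  { user_id: 42, plan: 'free', updated_at: 100 },
  { user_id: 42, plan: 'pro', updated_at: 200 }, // later upsert wins
  { user_id: 7, plan: 'free', updated_at: 150 },
]);
// merged has two rows; user 42 keeps plan 'pro'
```

Until a merge (or a FINAL read) runs, both versions of user 42 exist on disk, so plain SELECTs can see duplicates.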
```sql
CREATE TABLE analytics.daily_stats (
    date Date,
    tenant_id UInt32,
    event_type LowCardinality(String),
    event_count UInt64,
    unique_users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY (tenant_id, event_type, date);
```
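The AggregateFunction(uniq, UInt64) column stores partial aggregation states that merge across parts. Conceptually this behaves like a set union (a rough model; the real uniq state is a compact probabilistic sketch, not a literal set):

```typescript
// Rough model of merging AggregateFunction(uniq, UInt64) states: each part
// holds a partial state (modeled here as a Set of user ids); merging unions
// them, so the final count never double-counts users seen in multiple parts.
function mergeUniqStates(states: Set<number>[]): Set<number> {
  const out = new Set<number>();
  for (const s of states) for (const v of s) out.add(v);
  return out;
}

const partA = new Set([1, 2, 3]); // state from one inserted batch
const partB = new Set([3, 4]);    // state from another; user 3 overlaps
const mergedState = mergeUniqStates([partA, partB]);
// a uniqMerge() read over these states reports 4 unique users, not 5
```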
| Partition Expression | Typical Use | Parts Per Partition |
|---|---|---|
| toYYYYMM(date) | Most common — monthly | Target 10-1000 |
| toMonday(date) | Weekly rollups | More parts, finer drops |
| toYYYYMMDD(date) | Daily TTL drops | Many parts — use carefully |
| None | Small tables (<1M rows) | Fine |
Warning: Each partition creates separate parts on disk. Over-partitioning
(e.g., by user_id) creates millions of tiny parts and kills performance.
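A back-of-envelope calculation makes the warning concrete (the helper and the 10,000-part comfort limit are illustrative assumptions, not ClickHouse settings):

```typescript
// Parts on disk scale roughly with (partitions) x (parts per partition).
function estimateParts(partitions: number, partsPerPartition: number): number {
  return partitions * partsPerPartition;
}

// Flag schemes whose total part count blows past a comfortable ceiling.
function overPartitioned(
  partitions: number,
  partsPerPartition: number,
  limit = 10_000, // illustrative threshold, not a ClickHouse setting
): boolean {
  return estimateParts(partitions, partsPerPartition) > limit;
}

overPartitioned(24, 20);      // monthly partitions over 2 years: false
overPartitioned(100_000, 20); // partitioning by a 100k-user id: true
```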
```sql
-- Column-level compression codecs
column1 UInt64 CODEC(Delta, ZSTD(3)),      -- Time series / sequential IDs
column2 Float64 CODEC(Gorilla, ZSTD(1)),   -- Floating point (similar values)
column3 String CODEC(ZSTD(3)),             -- General text / JSON
column4 DateTime CODEC(DoubleDelta, ZSTD), -- Timestamps (near-sequential)
```
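To see why Delta helps on sequential data, here is a minimal sketch of delta encoding (the transform itself, not ClickHouse's implementation): storing differences turns large monotone values into tiny integers that a general-purpose codec like ZSTD then compresses far better.

```typescript
// Delta encoding: keep the first value, then store successive differences.
function deltaEncode(values: number[]): number[] {
  return values.map((v, i) => (i === 0 ? v : v - values[i - 1]));
}

// Decoding reverses the transform by running a prefix sum, losslessly.
function deltaDecode(deltas: number[]): number[] {
  const out: number[] = [];
  for (const d of deltas) out.push(out.length === 0 ? d : out[out.length - 1] + d);
  return out;
}

const ids = [1000001, 1000002, 1000004, 1000007]; // near-sequential IDs
const encoded = deltaEncode(ids); // [1000001, 1, 2, 3]: mostly tiny values
const roundTrip = deltaDecode(encoded); // recovers the original IDs exactly
```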
```typescript
import { createClient } from '@clickhouse/client';

const client = createClient({ url: process.env.CLICKHOUSE_HOST! });

async function applySchema() {
  await client.command({ query: 'CREATE DATABASE IF NOT EXISTS analytics' });
  await client.command({
    query: `
      CREATE TABLE IF NOT EXISTS analytics.events (
        event_id UUID DEFAULT generateUUIDv4(),
        tenant_id UInt32,
        event_type LowCardinality(String),
        user_id UInt64,
        payload String CODEC(ZSTD(3)),
        created_at DateTime DEFAULT now()
      )
      ENGINE = MergeTree()
      ORDER BY (tenant_id, event_type, created_at)
      PARTITION BY toYYYYMM(created_at)
    `,
  });
  console.log('Schema applied.');
}

applySchema()
  .catch(console.error)
  .finally(() => client.close());
```
| Error | Cause | Solution |
|---|---|---|
| `ORDER BY expression not in primary key` | PRIMARY KEY != ORDER BY | Remove explicit PRIMARY KEY or align them |
| `Too many parts (300+)` | Over-partitioning | Use a coarser partition expression |
| `Cannot convert String to UInt64` | Wrong data type | Match insert types to the schema |
| `TTL expression type mismatch` | TTL on non-date column | TTL must reference a Date/DateTime column |
For inserting and querying data, see clickhouse-core-workflow-b.