The operational companion to database design. While database-designer focuses on schema architecture and database-schema-designer handles ERD modeling, this skill covers the day-to-day: writing queries, optimizing performance, generating migrations, and bridging the gap between application code and database engines.
| Script | Purpose |
|---|---|
scripts/query_optimizer.py |
Static analysis of SQL queries for performance issues |
scripts/migration_generator.py |
Generate migration file templates from change descriptions |
scripts/schema_explorer.py |
Generate schema documentation from introspection queries |
When converting requirements to SQL, follow this sequence:
Top-N per group (window function)
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rn
FROM employees
) ranked WHERE rn <= 3;
Running totals
SELECT date, amount,
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM transactions;
Gap detection
SELECT curr.id, curr.seq_num, prev.seq_num AS prev_seq
FROM records curr
LEFT JOIN records prev ON prev.seq_num = curr.seq_num - 1
WHERE prev.id IS NULL AND curr.seq_num > 1;
UPSERT (PostgreSQL)
INSERT INTO settings (key, value, updated_at)
VALUES ('theme', 'dark', NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
UPSERT (MySQL)
INSERT INTO settings (key_name, value, updated_at)
VALUES ('theme', 'dark', NOW())
ON DUPLICATE KEY UPDATE value = VALUES(value), updated_at = VALUES(updated_at);
See references/query_patterns.md for JOINs, CTEs, window functions, JSON operations, and more.
PostgreSQL — list tables and columns
SELECT table_name, column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
PostgreSQL — foreign keys
SELECT tc.table_name, kcu.column_name,
ccu.table_name AS foreign_table, ccu.column_name AS foreign_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu ON tc.constraint_name = ccu.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY';
MySQL — table sizes
SELECT table_name, table_rows,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY data_length DESC;
SQLite — schema dump
SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name;
SQL Server — columns with types
SELECT t.name AS table_name, c.name AS column_name,
ty.name AS data_type, c.max_length, c.is_nullable
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.types ty ON c.user_type_id = ty.user_type_id
ORDER BY t.name, c.column_id;
Use scripts/schema_explorer.py to produce markdown or JSON documentation:
python scripts/schema_explorer.py --dialect postgres --tables all --format md
python scripts/schema_explorer.py --dialect mysql --tables users,orders --format json --json
WHERE status = 'active')| Anti-Pattern | Rewrite |
|---|---|
SELECT * FROM orders |
SELECT id, status, total FROM orders (explicit columns) |
WHERE YEAR(created_at) = 2025 |
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01' (sargable) |
| Correlated subquery in SELECT | LEFT JOIN with aggregation |
NOT IN (SELECT ...) with NULLs |
NOT EXISTS (SELECT 1 ...) |
UNION (dedup) when not needed |
UNION ALL |
LIKE '%search%' |
Full-text search index (GIN/FULLTEXT) |
ORDER BY RAND() |
Application-side random sampling or TABLESAMPLE |
Symptoms:
Fixes:
include in Prisma, joinedload in SQLAlchemy)WHERE id IN (...)
python scripts/query_optimizer.py --query "SELECT * FROM orders WHERE status = 'pending'" --dialect postgres
python scripts/query_optimizer.py --query queries.sql --dialect mysql --json
See references/optimization_guide.md for EXPLAIN plan reading, index types, and connection pooling.
Adding a column (safe)
-- Up
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Down
ALTER TABLE users DROP COLUMN phone;
Renaming a column (expand-contract)
-- Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
-- Step 2: Backfill
UPDATE users SET full_name = name;
-- Step 3: Deploy app reading both columns
-- Step 4: Deploy app writing only new column
-- Step 5: Drop old column
ALTER TABLE users DROP COLUMN name;
Adding a NOT NULL column (safe sequence)
-- Step 1: Add nullable
ALTER TABLE orders ADD COLUMN region VARCHAR(50);
-- Step 2: Backfill with default
UPDATE orders SET region = 'unknown' WHERE region IS NULL;
-- Step 3: Add constraint
ALTER TABLE orders ALTER COLUMN region SET NOT NULL;
ALTER TABLE orders ALTER COLUMN region SET DEFAULT 'unknown';
Index creation (non-blocking, PostgreSQL)
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
Every migration must have a reversible down script. For irreversible changes:
pg_dump the affected tablespython scripts/migration_generator.py --change "add email_verified boolean to users" --dialect postgres --format sql
python scripts/migration_generator.py --change "rename column name to full_name in customers" --dialect mysql --format alembic --json
| Feature | PostgreSQL | MySQL | SQLite | SQL Server |
|---|---|---|---|---|
| UPSERT | ON CONFLICT DO UPDATE |
ON DUPLICATE KEY UPDATE |
ON CONFLICT DO UPDATE |
MERGE |
| Boolean | Native BOOLEAN |
TINYINT(1) |
INTEGER |
BIT |
| Auto-increment | SERIAL / GENERATED |
AUTO_INCREMENT |
INTEGER PRIMARY KEY |
IDENTITY |
| JSON | JSONB (indexed) |
JSON |
Text (ext) | NVARCHAR(MAX) |
| Array | Native ARRAY |
Not supported | Not supported | Not supported |
| CTE (recursive) | Full support | 8.0+ | 3.8.3+ | Full support |
| Window functions | Full support | 8.0+ | 3.25.0+ | Full support |
| Full-text search | tsvector + GIN |
FULLTEXT index |
FTS5 extension | Full-text catalog |
| LIMIT/OFFSET | LIMIT n OFFSET m |
LIMIT n OFFSET m |
LIMIT n OFFSET m |
OFFSET m ROWS FETCH NEXT n ROWS ONLY |
information_schema varies between engines'YYYY-MM-DD' works everywhereSchema definition
model User {
id Int @id @default(autoincrement())
email String @unique
name String?
posts Post[]
createdAt DateTime @default(now())
}
model Post {
id Int @id @default(autoincrement())
title String
author User @relation(fields: [authorId], references: [id])
authorId Int
}
Migrations: npx prisma migrate dev --name add_user_email
Query API: prisma.user.findMany({ where: { email: { contains: '@' } }, include: { posts: true } })
Raw SQL escape hatch: prisma.$queryRaw\SELECT * FROM users WHERE id = ${userId}``
Schema-first definition
export const users = pgTable('users', {
id: serial('id').primaryKey(),
email: varchar('email', { length: 255 }).notNull().unique(),
name: text('name'),
createdAt: timestamp('created_at').defaultNow(),
});
Query builder: db.select().from(users).where(eq(users.email, email))
Migrations: npx drizzle-kit generate:pg then npx drizzle-kit push:pg
Entity decorators
@Entity()
export class User {
@PrimaryGeneratedColumn()
id: number;
@Column({ unique: true })
email: string;
@OneToMany(() => Post, post => post.author)
posts: Post[];
}
Repository pattern: userRepo.find({ where: { email }, relations: ['posts'] })
Migrations: npx typeorm migration:generate -n AddUserEmail
Declarative models
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
email = Column(String(255), unique=True, nullable=False)
name = Column(String(255))
posts = relationship('Post', back_populates='author')
Session management: Always use with Session() as session: context manager
Alembic migrations: alembic revision --autogenerate -m "add user email"
See references/orm_patterns.md for side-by-side comparisons and migration workflows per ORM.
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Use Case |
|---|---|---|---|---|
| READ UNCOMMITTED | Yes | Yes | Yes | Never recommended |
| READ COMMITTED | No | Yes | Yes | Default for PostgreSQL, general OLTP |
| REPEATABLE READ | No | No | Yes (InnoDB: No) | Financial calculations |
| SERIALIZABLE | No | No | No | Critical consistency (billing, inventory) |
pg_advisory_lock() for application-level coordination# Full backup
pg_dump -Fc --no-owner dbname > backup.dump
# Restore
pg_restore -d dbname --clean --no-owner backup.dump
# Point-in-time recovery: configure WAL archiving + restore_command
# Full backup
mysqldump --single-transaction --routines --triggers dbname > backup.sql
# Restore
mysql dbname < backup.sql
# Binary log for PITR: mysqlbinlog --start-datetime="2025-01-01 00:00:00" binlog.000001
# Backup (safe with concurrent reads)
sqlite3 dbname ".backup backup.db"
| Anti-Pattern | Problem | Fix |
|---|---|---|
SELECT * |
Transfers unnecessary data, breaks on schema changes | Explicit column list |
| Missing indexes on FK columns | Slow JOINs and cascading deletes | Add indexes on all foreign keys |
| N+1 queries | 1 + N round trips to database | Eager loading or batch queries |
| Implicit type coercion | WHERE id = '123' prevents index use |
Match types in predicates |
| No connection pooling | Exhausts connections under load | PgBouncer, ProxySQL, or ORM pool |
| Unbounded queries | No LIMIT risks returning millions of rows | Always paginate |
| Storing money as FLOAT | Rounding errors | Use DECIMAL(19,4) or integer cents |
| God tables | One table with 50+ columns | Normalize or use vertical partitioning |
| Soft deletes everywhere | Complicates every query with WHERE deleted_at IS NULL |
Archive tables or event sourcing |
| Raw string concatenation | SQL injection | Parameterized queries always |
| Skill | Relationship |
|---|---|
| database-designer | Schema architecture, normalization analysis, ERD generation |
| database-schema-designer | Visual ERD modeling, relationship mapping |
| migration-architect | Complex multi-step migration orchestration |
| api-design-reviewer | Ensuring API endpoints align with query patterns |
| observability-platform | Query performance monitoring, slow query alerts |