Fastest path from schema drift to safe change: define schemas in HCL, detect drift, generate SQL, and ship database + streaming updates with confidence.
Linux: supports AMD64 and ARM64. Installs to ~/.local/bin/kolumn
macOS: supports Intel and Apple Silicon. Installs to ~/.local/bin/kolumn
Windows: download kolumn.exe and add it to your PATH, then run kolumn version to verify. (WSL is also an option: wsl --install.)

```hcl
# main.kl - Define your database schema
provider "postgres" {
  host     = "localhost"
  port     = 5432
  database = "myapp"
  username = var.db_user
  password = var.db_password
}

create "postgres_table" "users" {
  name   = "users"
  schema = "public"

  columns = {
    id = {
      type        = "bigserial"
      primary_key = true
    }
    email = {
      type     = "varchar(255)"
      not_null = true
      unique   = true
    }
    created_at = {
      type    = "timestamptz"
      default = "now()"
    }
  }
}
```
From HCL to SQL with drift detection, risk scoring, controlled apply, and state.
kolumn plan diffs HCL vs live DB, finds drift/unmanaged objects.
Builds ordered DDL (CREATE/ALTER) plus rollback SQL for every step.
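Illustratively (this is not Kolumn's literal output), a plan for the quick-start users table above could pair forward DDL with its rollback:

```sql
-- Forward (ordered DDL)
CREATE TABLE public.users (
    id         bigserial PRIMARY KEY,
    email      varchar(255) NOT NULL UNIQUE,
    created_at timestamptz DEFAULT now()
);

-- Rollback (generated alongside each forward step)
DROP TABLE public.users;
```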
Shows a plan summary: create table users, add unique index on email, etc.
Each change is labeled low/medium/high (drops, rewrites, locking changes).
High-risk steps call out why: "Drop column email is destructive", "Rewrite table".
You decide to proceed, edit config, or split the change before applying.
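One way to act on a high-risk flag is to split the change across plans. A hedged sketch, reusing the column-map syntax from the quick-start config (the replacement column name is invented for illustration):

```hcl
# Plan 1: add a nullable replacement column (low risk); backfill out of band.
columns = {
  email     = { type = "varchar(255)", not_null = true }  # existing column
  email_txt = { type = "text" }                           # new, illustrative name
}

# Plan 2 (later, once reads/writes have moved): delete the `email`
# entry from this map so the destructive drop becomes an isolated,
# reviewable step instead of part of a larger change.
```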
Runs the generated SQL in order with transactional safety where supported.
Stops on errors, respects lock-sensitive operations, surfaces rollback SQL.
Reports success or failure per statement and warns when an apply completes only partially.
Kolumn records the post-apply state (tables, columns, checks, indexes).
Future plan runs diff against state + live DB to find new drift.
Plan/apply logs with timestamps and SQL are kept for review/compliance.
Stream CDC from MSSQL into Snowflake with auto-schema creation, merge batching, and drift-safe routing filters.
```hcl
# main.kl - stream CDC from MSSQL into Snowflake (end-to-end)
provider "mssql" {
  host     = "mssql.internal"
  database = "orders"
  username = var.mssql_user
  password = var.mssql_password
}

provider "snowflake" {
  account   = var.snowflake_account
  database  = "ANALYTICS"
  schema    = "CDC"
  warehouse = "COMPUTE_WH"
  role      = "CDC_SYNC"
}

# Source table with PII column
create "mssql_table" "users" {
  name   = "users"
  schema = "dbo"

  columns = {
    id = {
      type        = "uniqueidentifier"
      default     = "NEWSEQUENTIALID()"
      primary_key = true
    }
    email = {
      type     = "varchar(255)"
      not_null = true
      unique   = true
    }
    created_at = {
      type    = "datetime2"
      default = "SYSUTCDATETIME()"
    }
  }
}

# Snowflake roles/users and masking for PII
create "snowflake_role" "cdc_consumer" {
  name = "CDC_CONSUMER"
}

create "snowflake_user" "cdc_sync_user" {
  name         = "CDC_SYNC_USER"
  default_role = snowflake_role.cdc_consumer.name
  password     = var.snowflake_password
}

create "snowflake_masking_policy" "mask_email" {
  name = "MASK_EMAIL"
  body = "CASE WHEN CURRENT_ROLE() = 'CDC_CONSUMER' THEN email ELSE REGEXP_REPLACE(email, '(.).+(@.*)', '\\\\1***\\\\2') END"
}

# Snowflake landing table with masking applied
create "snowflake_table" "users_replica" {
  database = "ANALYTICS"
  schema   = "CDC"
  name     = "USERS"

  columns = {
    id = {
      type        = "BINARY(16)"
      primary_key = true
    }
    email = {
      type           = "VARCHAR"
      masking_policy = snowflake_masking_policy.mask_email.name
    }
    created_at = {
      type = "TIMESTAMP_NTZ"
    }
  }
}

create "stream_sink" "snowflake_replica" {
  type = "snowflake"

  connection_info = {
    account   = var.snowflake_account
    user      = "CDC_SYNC_USER"
    password  = var.snowflake_password
    database  = "ANALYTICS"
    schema    = "CDC"
    warehouse = "COMPUTE_WH"
  }

  batch_mode    = "merge"
  buffer_size   = 1000
  create_tables = true
}

create "stream_route" "mssql_orders_to_snowflake" {
  source = {
    type        = "mssql_cdc"
    database    = "orders"
    slot_name   = "cdc_orders_slot"
    publication = "cdc_orders_publication"
  }

  sink = stream_sink.snowflake_replica.id

  filter {
    event_types = ["DML"]
    conditions  = ["schema NOT IN ('cdc', 'sys')"]
  }

  throughput_targets {
    target_tps   = 1000
    worker_count = 1
  }

  # Optional: allow DDL table creates from CDC (if desired)
  # filter {
  #   event_types = ["DDL", "DML"]
  #   conditions  = ["operation IN ('CREATE TABLE','ALTER TABLE','DROP TABLE')"]
  # }
}
```
PII columns (e.g., email) can be classified in Kolumn so Snowflake masking/role grants apply on arrival.
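Kolumn's exact classification syntax isn't shown on this page; assuming a column-level attribute exists for it, tagging might look roughly like:

```hcl
# `classification` is a hypothetical attribute name — illustrative only.
email = {
  type           = "varchar(255)"
  classification = "pii"   # would drive masking/role grants on the Snowflake side
}
```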
How events travel from source tables to Snowflake with Kolumn + streaming.
Source changes captured from orders via MSSQL CDC.
Events flow through the SchemaBounce streaming service, aligned to Kolumn config.
CDC rows land in ANALYTICS.CDC; downstream models consume curated views.
Kolumn tags columns (PII/financial) and propagates policies to Snowflake.
Snowflake roles, masking policies, and access filters generated from HCL.
Data objects track source → warehouse lineage for audits and impact analysis.
Per-step throughput, latency, retries, and queue depth exported to Grafana.
Failed events land in a dead-letter queue (DLQ) with filters and retry paths; events can be exported for forensics.
Plan/apply logs + Kolumn state track schema and migration history end-to-end.
Create + discover + state/RPC wiring across databases, warehouses, lakehouses, and NoSQL/time-series engines.
Compare your config against live databases. Instantly see what changed, what's unmanaged, and what needs attention.
Review generated SQL before applying. Risk scoring highlights potentially dangerous changes so you can proceed with confidence.
Manage dev, staging, and production with the same config. Variables and environment files keep credentials separate.
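The configs above reference var.db_user and friends; assuming Terraform-style variable declarations and per-environment value files (syntax not confirmed on this page), the separation might look like:

```hcl
# variables.kl — declarations (hypothetical syntax)
variable "db_user"     {}
variable "db_password" { sensitive = true }

# dev.klvars / prod.klvars — per-environment values, kept out of main.kl:
# db_user     = "app_dev"
# db_password = injected from a secret store or CI environment
```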
Plans include rollback notes: simple creates map to drops; complex changes (type rewrites) include guidance and may need backups for full recovery.
Run plan in CI to catch drift early. Block risky PRs before they hit production. Integrate with GitHub Actions, GitLab CI, or any pipeline.
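A minimal GitHub Actions sketch of that workflow; the install step, secret name, and exit-code behavior are assumptions here — only the `kolumn plan` command is taken from this page:

```yaml
# .github/workflows/drift-check.yml (illustrative)
name: drift-check
on: [pull_request]
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install kolumn           # install method is a placeholder
        run: curl -fsSL https://example.com/install.sh | sh
      - name: Detect drift before merge
        run: kolumn plan               # step fails the PR if plan errors
        env:
          DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
```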
Every compare and apply is logged. Know who changed what and when for compliance and debugging.
Structured forward/backward migrations (no raw SQL): prechecks, provider-run steps, rollback paths, and optional quarantine/backup hints recorded in state.
Plans flag high-risk steps (drops/rewrites), enforce prechecks, and let you pause, approve, or split changes before apply.
13 engines supported with create + discover + state/RPC wiring: Postgres, MySQL, MSSQL, SQLite, CockroachDB, Snowflake, BigQuery, Databricks, Redshift, DuckDB, MongoDB, DynamoDB, InfluxDB.