7.9.43. SQL-42 — multi-version ETL with schema_from
This tutorial builds on SQL-39 — schema_from: struct mirrors the DB. Read it first if you haven’t.
Imagine you’ve got Logs databases in the wild at v1 (no
Severity column) and v2 (Severity added). You want a
single program that reads from either, normalizes, and writes to
a fresh DB. With [sql_table(schema_from=...)], the v1/v2
structs are generated from the actual schema files; with
typeinfo has_field<...> + static_if, one function body handles
both at compile time. Drift in either source DB stops the build
at the exact lines that need updating.
7.9.43.1. One struct per known historical shape
require daslib/sql
require sqlite/sqlite_boost
require sqlite/sqlite_linq
[sql_table(name = "Logs",
schema_from = "tests/dasSQLITE/test_data/schema_from_v1.db")]
struct OldLogV1 {}
[sql_table(name = "Logs",
schema_from = "tests/dasSQLITE/test_data/schema_from_v2.db")]
struct OldLogV2 {}
// The current shape — hand-declared, lives in code, evolves with the app.
[sql_table(name = "Logs")]
struct LogEntry {
@sql_primary_key Id : int64
Severity : int64
Message : string
}
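For orientation, here is roughly what the two mirrored structs end up containing, written out by hand. This is a sketch, not generated output: the real fields come from the .db files, and the int64/string types are assumptions carried over from LogEntry.

// Conceptual hand-written equivalents of the schema_from expansion.
// Do not paste these next to the generated structs; they only illustrate the shapes.
struct LikeOldLogV1 {
    Id : int64                // v1: Id + Message only
    Message : string
}
struct LikeOldLogV2 {
    Id : int64
    Severity : int64          // v2: Severity added
    Message : string
}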
7.9.43.2. Generic migration body with static_if branching
static_if (typeinfo has_field<Severity>(type<TT>)) lets one
generic body handle both v1 and v2 sources. Compile-time
branching; no runtime cost. If a future version drops
Severity, the v1 branch kicks in automatically. If a version
removes a field LogEntry still expects (Id or
Message), compilation fails at the assignment line.
def migrate_one_row(src : auto(TT)) : LogEntry {
    var dst : LogEntry
    dst.Id = src.Id
    dst.Message = src.Message
    static_if (typeinfo has_field<Severity>(type<TT>)) {
        dst.Severity = src.Severity
    } else {
        dst.Severity = 1l // backfill: missing-Severity rows = info
    }
    return dst
}
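A quick way to see the predicate at work is a pair of compile-time assertions. This assumes static_assert is available (as in stock Daslang) and reuses the typeinfo form from above.

// Compile-time sanity checks on the mirrored shapes (assumption: static_assert exists).
static_assert(!typeinfo has_field<Severity>(type<OldLogV1>))
static_assert(typeinfo has_field<Severity>(type<OldLogV2>))
static_assert(typeinfo has_field<Severity>(type<LogEntry>))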
7.9.43.3. Streaming with _each_sql
For large old DBs, materialize-then-iterate uses memory
proportional to the result set. _each_sql is the streaming
iterator — one row at a time, no full-array materialization.
[export]
def main() {
    with_sqlite(":memory:") <| $(newDb) {
        newDb |> create_table(type<LogEntry>)
        with_sqlite("tests/dasSQLITE/test_data/schema_from_v1.db") <| $(oldDb) {
            for (e in _each_sql(oldDb |> select_from(type<OldLogV1>))) {
                newDb |> insert(migrate_one_row(e))
            }
        }
        with_sqlite("tests/dasSQLITE/test_data/schema_from_v2.db") <| $(oldDb) {
            for (e in _each_sql(oldDb |> select_from(type<OldLogV2>))) {
                newDb |> insert(migrate_one_row(e))
            }
        }
    }
}
Same migrate_one_row, different compile-time branch.
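The same streaming iterator also gives a cheap sanity check on the target before the connection closes. A sketch, placed just before the end of the with_sqlite(":memory:") block; only APIs already used above appear here, and the check itself is not part of the tutorial source.

// Still inside the ":memory:" connection: count what actually landed in Logs.
var migrated = 0
for (e in _each_sql(newDb |> select_from(type<LogEntry>))) {
    migrated++
}
print("migrated {migrated} rows\n")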
7.9.43.4. When this is the right tool
schema_from + typeinfo has_field is for ETL, archival
readers, one-shot data movement scripts. The script ships with
the historical .db schemas baked in. If a new historical version
surfaces, you add a third struct + a third loop and the type
system tells you exactly where to wire it up.
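For instance, a newly discovered v3 snapshot would bolt on like this; the file name is hypothetical.

// Hypothetical third historical shape; the .db path below is made up.
[sql_table(name = "Logs",
           schema_from = "tests/dasSQLITE/test_data/schema_from_v3.db")]
struct OldLogV3 {}

// ...plus one more loop inside main(), next to the v1/v2 ones:
with_sqlite("tests/dasSQLITE/test_data/schema_from_v3.db") <| $(oldDb) {
    for (e in _each_sql(oldDb |> select_from(type<OldLogV3>))) {
        newDb |> insert(migrate_one_row(e))  // fails to compile exactly where v3 diverges
    }
}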
For “my app’s schema evolves at runtime; I run migrations at
startup”, a future daslib/sqlite_migrate module will ship a
different shape: versioned [sql_migration(version=N)]
functions applied in order, tracked in __schema_version,
one transaction per migration. The two patterns coexist:
schema_from for code that mirrors a known schema, sqlite_migrate
for the schema-grows-over-time runner.
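To make the contrast concrete, here is a minimal sketch of what such versioned migrations might look like. It is purely illustrative: sqlite_migrate does not exist yet, and exec_sql is a hypothetical helper for running raw DDL; only the annotation shape and the v1-to-v2 history come from this tutorial.

// Illustrative only: the module and exec_sql are assumed, not real API.
[sql_migration(version = 1)]
def create_logs(db : auto) {
    db |> exec_sql("CREATE TABLE Logs (Id INTEGER PRIMARY KEY, Message TEXT)")
}

[sql_migration(version = 2)]
def add_severity(db : auto) {
    // mirrors the v1 -> v2 change above; DEFAULT 1 matches the backfill value
    db |> exec_sql("ALTER TABLE Logs ADD COLUMN Severity INTEGER NOT NULL DEFAULT 1")
}

// A runner would apply every function whose version exceeds the value stored
// in __schema_version, each inside its own transaction.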
See also
Full source: tutorials/sql/42-schema_evolution.das
Background: SQL-39 — schema_from: struct mirrors the DB
Streaming iteration: SQL-35 — Streaming results with _each_sql