10. C#-style LINQ query syntax (%linq!)
daslib/linq_das adds a C#-like query-expression form through the
%linq! … %% inline reader macro. A query is a purely mechanical,
compile-time rewrite into a _fold(...) chain, so it rides the same fused
execution engine as the pipe-form LINQ operators (see
LINQ-fold patterns — what _fold(...) recognizes) with no runtime cost over the hand-written chain.
require daslib/linq_das
var names <- %linq! from c in cars where c.price > 100 select c.name %%
rewrites to, and is re-parsed in place as:
var names <- ( _fold( each(cars) |> _where($(c) => c.price > 100) |> _select($(c) => c.name) |> to_array() ) )
The macro lives in the lexer’s inline reader-macro slot (%name!), so a
query is an ordinary expression — it can be assigned, passed as an argument, or
embedded in a larger expression.
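To make the "purely mechanical" rewrite concrete, here is a toy Python model of it for the simplest query shape only (one from, any number of where clauses, one select). It is a sketch under stated assumptions, not the reader macro itself: the function name rewrite_query is hypothetical, and joins, orderby, group, let, into and iterator are deliberately left out.

```python
import re

def rewrite_query(body: str) -> str:
    """Toy rewrite of `from v in src (where p)* select proj` into a _fold chain string."""
    # Split on the clause keywords; the capturing group keeps them in the result.
    head, *rest = re.split(r"\b(where|select)\b", body)
    keywords, texts = rest[0::2], rest[1::2]
    m = re.fullmatch(r"\s*from\s+(\w+)\s+in\s+(.+?)\s*", head)
    if m is None or keywords.count("select") != 1 or keywords[-1] != "select":
        raise ValueError("expected: from <var> in <src> (where <pred>)* select <proj>")
    var, src = m.groups()
    stages = [f"each({src})"]
    for keyword, text in zip(keywords, texts):
        text = text.strip()
        if keyword == "where":
            # each where becomes its own filter, in source order
            stages.append(f"_where($({var}) => {text})")
        elif text != var:
            # identity projection (`select c`) emits no _select stage
            stages.append(f"_select($({var}) => {text})")
    stages.append("to_array()")
    return "( _fold( " + " |> ".join(stages) + " ) )"

print(rewrite_query("from c in cars where c.price > 100 select c.name"))
```

The range variable and the predicate / projection text are spliced through unchanged, which is why the output lines up with the expansion shown above.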
10.1. Clauses
A query is from <var> [ : <Row> ] in <src> ( where <pred> )* [ ( join <var2>
[ : <Row2> ] in <src2> on <keyA> equals <keyB> [ into <g> ] | from <var2> [ : <Row2> ] in
<src2> ) ] ( where <pred> )* [ orderby <expr> [ascending|descending] (, <expr> [ascending|descending])* ] ( select <proj> |
group <var> by <key> ) [ into <var> <continuation> ] [ iterator ] — a second
range variable comes from either a join or a second from (never
both); each where slot accepts any number of predicates (each AND-folds in
source order). into has two forms: join … equals … into <g> is
a group join (g = the array of matching right rows, in scope with the
left variable — see Join), while a trailing into <var> after
the terminal is a query continuation that rebinds the stage’s output and
continues from there (single-source only — see Query continuation (into)).
Separately, a let <name> = <expr> binding may appear any number of times
between body clauses — it is inlined away before the rest is parsed (see
Let bindings):
from <var> in <source>: the element bind; <var> names the per-row value. With no type annotation, <source> is an array<T>.
let <name> = <expr>: optional, repeatable, and free to appear between any body clauses; binds a computed value reused in the clauses that follow it (see Let bindings).
where <predicate>: optional and repeatable. A where before the join / second from filters the left source (single range var); a where after it sees both range variables. Several where clauses may appear in either slot; each emits its own filter, AND-folded in source order (over a SQL source they push down as one ANDed WHERE). A where written after the orderby filters the sorted sequence, which for a total order is identical to filtering first, so it emits ahead of the sort.
join <var2> in <src2> on <keyA> equals <keyB>: optional, a single inner equi-join introducing a second range variable (see Join).
from <var2> in <src2>: optional, a second from introducing a second range variable (SelectMany). An independent source is the cross product; a source that is a field of the first range variable (from l in o.lines) is the correlated flatten (see Multiple from (SelectMany)).
orderby <expr> [descending], …: optional; one or more comma-separated sort keys, each with its own direction (see Ordering). Omitted from the chain when absent.
select <projection>: select <var> (the identity projection) returns the rows unchanged; any other projection emits _select(...).
group <var> by <key>: the alternative terminal to select (see Grouping).
into <var>: optional query continuation after the terminal; rebinds the prior stage's output to <var> and continues with more clauses, all on the same fused chain (see Query continuation (into)).
Each stage ends with either select or group … by — exactly one.
Clauses may span multiple lines inside the %linq! … %% body.
10.2. Sources
An untyped from c in <arr> is an array source. A typed range
variable from c : Row in <src> selects a non-array source — the row type
Row is supplied on the range variable (C#-faithful from Type c in src)
because the source value alone does not carry it:
// array (untyped) — `each(arr)`
var a <- %linq! from c in cars where c.price > 100 select c.name %%
// decs — the `decs` keyword marker → `from_decs_template(type<CarComp>)`
var d <- %linq! from c : CarComp in decs where c.price > 100 select c.name %%
// SQL — a SqlRunner value → `select_from`, pushed down by `_sql`
var s <- %linq! from c : Car in db where c.price > 100 select c.name %%
// XML — an xml_node value → `from_xml_node`, fused by the XmlAdapter
var x <- %linq! from c : Car in doc.document_element where c.price > 100.0 select c.brand %%
// JSON — a JsonValue? array → `from_json`, fused by the JsonAdapter
var j <- %linq! from c : Car in carsJson where c.price > 100 select c.name %%
For value sources (SQL, XML, JSON) the reader emits
from_in(<src>, type<Row>); the from_in call macro dispatches on the
source value’s type to the concrete builder (so a new backend is a new
from_in branch, never a parser change). decs has no source value, so it
is emitted directly as from_decs_template and never goes through
from_in. The row type’s required annotation depends on the source —
[decs_template] for decs, [sql_table] / [sql_view] for SQL, a plain
struct for XML and JSON. The JSON source is a JsonValue? holding a JSON
array of objects (from c : Car in jv["cars"] descends into a nested
array first); like the XML source, each element materializes by name —
top-level fields read by key, field-pruned to just the keys the chain reads. A
custom whole-row from_JV(Row) override is not honored (this is a flat
query source, not a deserializer); to query through from_JV instead,
materialize the array first
([for (e in jv.value as _array); from_JV(e, type<Row>)]) and query that.
10.3. Range variable
The range variable is spliced verbatim as the lambda parameter — the
predicate becomes _where($(c) => …) and the projection _select($(c) =>
…), keeping the range-variable name; the predicate and projection text is passed
through unchanged. Any identifier name works.
The _fold operator DSL accepts a named-variable $(x) => … block
directly. For the SQL source, _sql resolves a single source against the
placeholder _; the macro normalizes the single-source lambda parameter to
_ internally, so the range-variable name is still spliced verbatim at the
surface.
10.4. Filtering (where)
A where clause is optional and repeatable (as in C#) — each emits its
own _where filter, AND-folded in source order:
// two predicates — both apply
var names <- %linq! from c in cars where c.price > 100 where c.brand == "eco" select c.name %%
// expands to: _fold( each(cars) |> _where($(c) => c.price > 100) |> _where($(c) => c.brand == "eco") |> _select($(c) => c.name) |> to_array() )
Over a SQL source the predicates push down as one ANDed WHERE (a single
statement, no intermediate materialize). On a two-source query (join / second
from) the slot still applies: wheres before the second source filter
the left source (and push to SQL), wheres after it filter the carried
pair. A where written after the orderby filters the sorted sequence —
for a total order that is identical to filtering first, so it emits ahead of the
sort.
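Two claims in this section can be checked with a small Python model: chained filters are equivalent to one ANDed predicate, and filtering after a sort gives the same sequence as filtering before it. The helper names and the sample rows are illustrative assumptions; this models the semantics only, not the emitted _where stages.

```python
cars = [
    {"name": "a", "brand": "eco", "price": 90},
    {"name": "b", "brand": "eco", "price": 150},
    {"name": "c", "brand": "lux", "price": 300},
    {"name": "d", "brand": "eco", "price": 120},
]

def chained_where(rows, predicates):
    """Apply each predicate as its own filter, in source order."""
    for pred in predicates:
        rows = [r for r in rows if pred(r)]
    return rows

def single_where(rows, predicates):
    """Apply one filter whose predicate is the AND of all of them."""
    return [r for r in rows if all(pred(r) for pred in predicates)]

def filter_then_sort(rows, pred, key):
    return sorted((r for r in rows if pred(r)), key=key)

def sort_then_filter(rows, pred, key):
    return [r for r in sorted(rows, key=key) if pred(r)]

preds = [lambda c: c["price"] > 100, lambda c: c["brand"] == "eco"]
print([c["name"] for c in chained_where(cars, preds)])
```

Because filtering never reorders the rows that survive, moving the filter ahead of the sort changes only how much work the sort does, not the result.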
10.5. Let bindings
let <name> = <expr> introduces a computed value (what C# calls a new range variable)
that is reused in the clauses that follow it:
var rows <- %linq! from c in cars
let net = c.price - tax(c)
where net < 100
orderby net
select (Name = c.name, Net = net) %%
The binding is inlined textually: every later reference to net is
replaced with (c.price - tax(c)), so the query is exactly equivalent to
writing the expression out at each use site. Bindings are repeatable and chain —
a later let may reference an earlier one — and they may appear in any body
context (a single from, after a join referencing both range variables,
or in a from … from SelectMany):
// chained — `net` uses the earlier `disc`
var rows <- %linq! from c in cars
let disc = c.price / 10
let net = c.price - disc
orderby net descending select (N = c.name, Net = net) %%
Because the binding is inlined, a let over a SQL source pushes its
computed expression down: a binding used in where / orderby / select
becomes the computed predicate / key / column _sql renders directly (the
whole query stays a single SELECT). Three rules apply: the binding name must
differ from every range variable (including a second one from a join / second
from) and from any earlier binding; a let must precede the select / group
terminal; and an inlined expression is re-evaluated at each use site (a textual
inline, not a cached temporary).
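The textual inlining described above can be modeled in a few lines of Python. This is a simplified sketch, not the macro's implementation: inline_lets and substitute are hypothetical names, and the regex-based replacement ignores string literals and named-tuple field names that happen to match a binding.

```python
import re

def substitute(resolved, text):
    """Replace every standalone binding name in text with its parenthesized expression."""
    if not resolved:
        return text
    # (?<![\w.]) skips field accesses such as c.net; \b avoids matching a name prefix
    names = "|".join(re.escape(name) for name in resolved)
    pattern = re.compile(r"(?<![\w.])(" + names + r")\b")
    return pattern.sub(lambda m: "(" + resolved[m.group(1)] + ")", text)

def inline_lets(bindings, clause):
    """Inline let bindings, given in source order, into a later clause."""
    resolved = {}
    for name, expr in bindings:
        # a later let may reference an earlier one, so expand its expression first
        resolved[name] = substitute(resolved, expr)
    return substitute(resolved, clause)

bindings = [("disc", "c.price / 10"), ("net", "c.price - disc")]
print(inline_lets(bindings, "orderby net descending"))
```

Since every use site receives its own copy of the expression, a costly or side-effecting call inside a let runs once per reference rather than once per row.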
10.6. Projections
The select clause projects each (joined) row into the result element. Four
projection forms are supported:
Scalar (a single field or expression): select c.name → array<string>; select c.price * 2 → array<int>.
Named tuple (the C# select new { … } analog, and the way to return more than one column): select (Name = c.name, City = c.city) → an array of tuple<Name:string; City:string>; read fields by name (row.Name).
Whole-row / identity: select c returns the rows unchanged (array<Row>); it emits no _select stage. Over a join it yields the left row. Over a SQL source whole-row select c is in-memory only: a row has no column form, so project columns (scalar or named tuple) to push down.
String interpolation: select "{c.name}:{b.country}" → array<string>. The range variables are rewritten inside the { … } interpolation (so both c and b resolve), letting you format a row to a string in one step.
// scalar
var names <- %linq! from c in cars select c.name %%
// named tuple — two columns, read row.Name / row.Price
var rows <- %linq! from c in cars select (Name = c.name, Price = c.price) %%
// whole row (identity) — the filtered structs, unchanged
var kept <- %linq! from c in cars where c.price > 100 select c %%
// string interpolation — range vars rewritten inside {…}
var labels <- %linq! from c in cars select "{c.name} @ {c.price}" %%
10.7. Ordering
orderby <expr> [descending|ascending], … sorts by one or more keys, each
with its own direction. A single key emits _order_by($(c) => <expr>) (or
_order_by_descending(...)); multiple comma-separated keys emit one
_order_by_keys($(c) => (k1, k2, …), <descMask>) — a single composite stable
sort, where descMask bit i (LSB = first key) marks key i descending.
descending (and the default-explicit ascending) are recognized as trailing
keywords per key:
// single key, ascending (default)
var byPrice <- %linq! from c in cars orderby c.price select c.name %%
// single key, descending, after a where
var top <- %linq! from c in cars where c.price > 100 orderby c.price descending select c.name %%
// multi-key with mixed directions: brand ascending, then price descending
var rows <- %linq! from c in cars orderby c.brand, c.price descending select c %%
Works over all five sources: SQL emits ORDER BY c1, c2 DESC, …; array / decs / XML / JSON
sort the materialized rows. Multi-key ordering is stable (C# OrderBy / ThenBy
parity — rows equal on every key keep input order) and supports at most four keys.
Single-key ordering is unchanged — it keeps its existing (unstable) sort, so there is
no performance regression on the common single-key case.
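The descMask encoding and the stable composite sort can be illustrated with a short Python model. It is a sketch of the semantics described above, not of _order_by_keys itself; desc_mask and order_by_keys are illustrative names, and rows are plain tuples of (name, brand, price).

```python
from functools import cmp_to_key

def desc_mask(directions):
    """Bit i (LSB = first key) is set when key i is descending."""
    mask = 0
    for i, direction in enumerate(directions):
        if direction == "descending":
            mask |= 1 << i
    return mask

def order_by_keys(rows, key_fn, mask):
    """One stable sort over a composite key, flipping the comparison per masked key."""
    def compare(a, b):
        for i, (ka, kb) in enumerate(zip(key_fn(a), key_fn(b))):
            if ka != kb:
                less = ka < kb
                if (mask >> i) & 1:
                    less = not less
                return -1 if less else 1
        return 0  # equal on every key: sorted() keeps input order
    return sorted(rows, key=cmp_to_key(compare))

cars = [("a", "eco", 100), ("b", "eco", 300), ("c", "lux", 200), ("d", "eco", 300)]
# orderby c.brand, c.price descending
mask = desc_mask(["ascending", "descending"])
print(mask, [c[0] for c in order_by_keys(cars, lambda c: (c[1], c[2]), mask)])
```

Rows b and d tie on both keys and keep their input order, which is the OrderBy / ThenBy parity the multi-key form promises.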
10.8. Grouping
group <var> by <key> is a terminal (it replaces select). It emits
_group_by_lazy($(c) => <key>) and yields one tuple<key; array<elem>>
bucket per distinct key — the C# IGrouping shape. Read the key as ._0 and
the group’s elements as ._1:
var byBrand <- %linq! from c in cars group c by c.brand %%
for (g in byBrand) {
print("{g._0}: {g._1 |> length} cars\n") // key, then count of that bucket
}
A where may precede the group; orderby may not directly (order the
groups in an into continuation instead — see Query continuation (into)). The group
element must be the range variable (group c by …) — element selectors are not
yet supported.
A bare group … by (no continuation) keeps the whole (key, [rows])
group, so it is an in-memory feature (array / decs / XML / JSON); over a SQL
source it is rejected (SQL GROUP BY has no all-rows-per-group form). To
aggregate per group — the common case — add an into continuation
(group c by k into g select (…, g |> length, g |> select(…) |> sum)), which
does push down to SQL GROUP BY (see Query continuation (into)). (Over decs
these minimal orderby / group chains currently materialize rather than
fuse — correct, but a _fold perf advisory fires; a decs-adapter gap, not a
query-syntax one.)
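As a mental model for the bucket shape, the following Python sketch groups rows into (key, [rows]) pairs. It only mirrors the result shape described above: group_by is a hypothetical helper, and returning buckets in first-seen key order is an assumption of this sketch, since the text does not specify bucket order.

```python
def group_by(rows, key_fn):
    """Return one (key, [rows]) bucket per distinct key."""
    buckets = {}
    for row in rows:
        buckets.setdefault(key_fn(row), []).append(row)
    # dicts preserve insertion order, so buckets come out in first-seen key order
    return list(buckets.items())

cars = [
    {"name": "civic", "brand": "honda"},
    {"name": "model3", "brand": "tesla"},
    {"name": "accord", "brand": "honda"},
]

for key, members in group_by(cars, lambda c: c["brand"]):
    # key plays the role of g._0, members the role of g._1
    print(f"{key}: {len(members)} cars")
```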
10.9. Query continuation (into)
into <var> rebinds the prior stage’s output to a new range variable and
continues the query with more clauses — all on the same fused _fold
chain, with no materialization between stages. It is supported on single-source
queries. C# uses it for grouped aggregation and to chain query stages.
Group continuation: after group c by k into g, the new range variable
g is the group; g.key is the key and a bare g is the member
collection:
var report <- %linq! from c in cars
group c by c.brand into g
select (brand = g.key,
count = g |> length,
total = g |> select($(u : Car) => u.price) |> sum) %%
A continuation may itself contain where / orderby / select / group
over the groups — e.g. where g |> length > 1 (drop singleton buckets) or
orderby g |> select($(u : Car) => u.price) |> sum descending (order buckets by
total).
Select continuation — select <proj> into n rebinds the projected value to
n and continues:
var kept <- %linq! from c in cars select c.price into p where p > 100 select p %%
Continuations chain (… into x … into y …), so a query can group, aggregate,
then filter/order the aggregated rows in one fused pass.
SQL pushdown. A group continuation whose select is aggregate-only —
g.key plus g |> length (→ COUNT(*)), g |> select(…) |> sum/average/min/max
(→ SUM/AVG/MIN/MAX), or g |> first() — pushes down to a SQL GROUP BY
over a SQL source, exactly like the hand-written _group_by pipe form. A
member-keeping continuation (identity select g, which keeps the whole
(key, [rows]) group) has no SQL form and is in-memory only (array / decs /
XML / JSON) — over a SQL source it is rejected, like a bare group … by.
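To see why an aggregate-only continuation has a direct SQL form, the Python sketch below computes the same brand / count / total report twice: once in memory and once with a GROUP BY statement in an in-memory SQLite database. The table layout and the SQL text are assumptions chosen for illustration; the statement that _sql actually renders is not shown in this document.

```python
import sqlite3

cars = [("civic", "honda", 120), ("accord", "honda", 180), ("model3", "tesla", 400)]

def report_in_memory(rows):
    """group c by c.brand into g select (brand, count, total), computed by hand."""
    groups = {}
    for _name, brand, price in rows:
        groups.setdefault(brand, []).append(price)
    return sorted((brand, len(prices), sum(prices)) for brand, prices in groups.items())

def report_sql(rows):
    """The same report as a single GROUP BY statement."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE cars (name TEXT, brand TEXT, price INTEGER)")
    con.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)
    out = con.execute(
        "SELECT brand, COUNT(*), SUM(price) FROM cars GROUP BY brand"
    ).fetchall()
    con.close()
    return sorted(out)

print(report_in_memory(cars))
print(report_sql(cars))
```

A member-keeping select g has no counterpart here: a GROUP BY result row can hold the key and aggregates, but not the list of rows in the bucket.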
10.10. Join
join <var2> [ : <Row2> ] in <src2> on <keyA> equals <keyB> adds a single
inner equi-join — one new range variable, one equality key. The second
source is built exactly like the first (untyped → array, typed → the
from_in dispatch), so it may be a different kind of source than the left.
The reader picks one of two emit shapes from the post-join clauses (it transpiles before type inference and cannot see the source, so it decides textually):
Select-terminal — no post-join where / orderby, terminal is
select. The select projection is the join’s result row (both range
variables are in scope), so it splices verbatim. A scalar or named-tuple
projection pushes down to SQL; a whole-row select c is in-memory only
(over SQL it has no column form — project columns instead):
var rows <- %linq! from c in cars join b in brands on c.brand equals b.brand
select (Name = c.name, Country = b.country) %%
A where before the join filters the left source (single range var) and
also pushes down — over an array/decs/XML/JSON source it fuses into the join’s probe
loop (no intermediate filtered array). Several pre-join wheres AND-fold
(see Filtering (where)):
var rows <- %linq! from c in cars where c.price >= 150 join b in brands
on c.brand equals b.brand select (Name = c.name, Country = b.country) %%
Transparent identifier — a post-join where (one or more) / orderby,
or a group terminal. The join carries (c, b) as a pair so the later
clauses can address both variables; the reader rewrites c / b to the
carried fields, and each post-join where becomes its own filter. This is
in-memory only (array / decs / XML / JSON) — over a SQL source the carried
whole-row tuple has no column form and _sql rejects it (project columns in a
select-terminal join, or filter pre-join, to push down):
// post-join where sees the joined row (both c and b)
var usa <- %linq! from c in cars join b in brands on c.brand equals b.brand
where b.country == "USA" select c.name %%
// group the joined pairs by the right-side key → IGrouping whose elements
// are the (c, b) pairs (read g._0 = key, g._1 = array of pairs)
var byCountry <- %linq! from c in cars join b in brands on c.brand equals b.brand
group c by b.country %%
Group join (join … equals … into <g>) — C# GroupJoin. into g
binds g to the array of matching right rows alongside the left range
variable; the terminal select reads both. It is outer — every left row
surfaces, an unmatched one paired with an empty group. The reader emits
_group_join, which fuses through the same join splice (a pre-join where
included), so the per-group aggregate runs in one hash-build + probe with no
intermediate:
// every brand with how many cars it has — a carless brand surfaces with 0
var perBrand <- %linq! from b in brands join c in cars on b.brand equals c.brand into g
select (Brand = b.brand, N = g |> length) %%
// aggregate over the group (sum of the matching cars' prices)
var totals <- %linq! from b in brands join c in cars on b.brand equals c.brand into g
select (Brand = b.brand, Total = g |> select($(c : Car) => c.price) |> sum) %%
join … into is select-terminal only (a pre-join where and a trailing
iterator are also allowed) and array sources only: _group_join has no SQL
push-down (over a SQL source it is rejected; write the aggregate in raw SQL
instead), and decs / XML / JSON group-joins are not yet fused. A post-into where
/ orderby / group over the (left, g) pair is rejected, because g is a
non-copyable array that can't ride the transparent-identifier carry; materialize
then transform, or drop to the pipe-form _group_join.
Only a single equi-key is supported — composite keys (a equals b && c
equals d), multiple joins, and orderby before group are rejected at
compile time.
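The difference between the inner equi-join and the outer group join is easiest to see in a hash-build + probe model. The Python sketch below is an illustration of the semantics only: the function names are hypothetical, and hashing the right-hand source is an assumption of this sketch rather than a statement about what _join or _group_join do internally.

```python
from collections import defaultdict

def build_index(right, right_key):
    """Hash-build: bucket the right rows by their join key."""
    index = defaultdict(list)
    for r in right:
        index[right_key(r)].append(r)
    return index

def inner_join(left, right, left_key, right_key, result):
    """One output row per matching (left, right) pair; unmatched left rows vanish."""
    index = build_index(right, right_key)
    return [result(l, r) for l in left for r in index.get(left_key(l), [])]

def group_join(left, right, left_key, right_key, result):
    """One output row per left row; an unmatched left row gets an empty group."""
    index = build_index(right, right_key)
    return [result(l, index.get(left_key(l), [])) for l in left]

brands = [{"brand": "honda"}, {"brand": "tesla"}, {"brand": "acme"}]
cars = [
    {"name": "civic", "brand": "honda", "price": 120},
    {"name": "accord", "brand": "honda", "price": 180},
    {"name": "model3", "brand": "tesla", "price": 400},
]

by_brand = lambda row: row["brand"]
print(group_join(brands, cars, by_brand, by_brand, lambda b, g: (b["brand"], len(g))))
```

The carless brand surfaces with a count of 0 in the group join, while the inner join over the same inputs produces no row for it.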
10.11. Multiple from (SelectMany)
A second from introduces a second range variable over an independent
source — the cross product, C#’s SelectMany:
from c in cars from b in brands select …
is every (c, b) pair. It shares the whole post-source clause grammar with
join (pre/post where, orderby, select / group, transparent
identifier, iterator) — it is exactly a join with no on … equals key,
and emits _cross_join instead of _join. The same two emit shapes apply:
Select-terminal — the select projection is the cross’s result row (both
range variables in scope). Over a SQL source it pushes down to a SQL CROSS
JOIN; a scalar / named-tuple projection has a column form, a whole-row
select c is in-memory only:
// 3 cars × 2 brands = 6 rows
var rows <- %linq! from c in cars from b in brands
select (Name = c.name, Country = b.country) %%
A where before the second from filters the left source (single range
var) and pushes down; cross-then-filter on a key equality is the equi-join
subset. Both slots are repeatable (see Filtering (where)):
// pre-from where filters cars, then crosses
var rows <- %linq! from c in cars where c.price >= 150 from b in brands
select (Name = c.name, Country = b.country) %%
// post-from where sees both vars (transparent identifier) — cross-then-filter
var matched <- %linq! from c in cars from b in brands
where c.brand == b.brand select c.name %%
Transparent identifier — a post-from where / orderby or a group
terminal carries (c, b) as a pair, in-memory only (same SQL boundary as
join).
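The statement that cross-then-filter on a key equality is the equi-join subset can be checked with a small Python model. The helper names and sample data are illustrative assumptions; the sketch models results only and says nothing about how _cross_join is executed.

```python
from itertools import product

cars = [
    {"name": "civic", "brand": "honda"},
    {"name": "model3", "brand": "tesla"},
    {"name": "niva", "brand": "lada"},
]
brands = [
    {"brand": "honda", "country": "Japan"},
    {"brand": "tesla", "country": "USA"},
]

def cross_join(left, right):
    """Every (left, right) pair, left-major."""
    return list(product(left, right))

def equi_join(left, right, left_key, right_key):
    """Pairs whose keys are equal, by nested loops."""
    return [(l, r) for l in left for r in right if left_key(l) == right_key(r)]

pairs = cross_join(cars, brands)
matched = [(c, b) for c, b in pairs if c["brand"] == b["brand"]]
print(len(pairs), [c["name"] for c, _ in matched])
```

Three cars crossed with two brands give six pairs, and keeping only the pairs with equal brand leaves exactly the rows an inner equi-join on that key would return.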
10.12. Iterator vs array output
By default a query materializes to an array<T> (via to_array()). A
trailing iterator keyword yields an iterator<T> instead (via
to_sequence()), for feeding a for loop or another pipeline without
binding a stored array (the chain still materializes internally — see below):
// array (default)
var names <- %linq! from c in cars where c.price > 100 select c.name %%
// iterator — consume without binding a stored array
for (nm in %linq! from c in cars where c.price > 100 select c.name iterator %%) {
print("{nm}\n")
}
The iterator form iterates over the optimized (fused / pushed-down) result: it
preserves each source's fusion and does not switch to lazy per-element streaming.
T is the projection type, or the source element type for an identity
select.
10.13. Comments in the body
Line (// …) and block (/* … */) comments inside the query body are stripped before parsing (replaced with spaces, newlines preserved), so a keyword or the range variable mentioned inside a comment never confuses the clause splitter and never leaks into the spliced chain. Comment markers inside string literals are left alone: a string literal is never treated as a comment.
10.14. Current limitations
The following are not yet supported:
orderby with more than four keys, and orderby directly before a group terminal (order the groups in an into continuation instead; see Ordering and Grouping).
Member-keeping groups over a SQL source: a bare group … by and an identity select g continuation are in-memory only and are rejected over SQL; only an aggregate-only group … into continuation pushes down to SQL GROUP BY. Element selectors in group (anything other than the range variable) are also not supported.
Composite / multiple joins: join is a single inner equi-join with one key; composite keys (a equals b && c equals d) and more than one join per query are rejected. Group joins (join … into) are select-terminal and array-source only: there is no SQL push-down, decs / XML / JSON group-joins are not yet fused, and a post-into where / orderby / group is rejected.
Post-join where / orderby / group over a SQL source: these use the transparent-identifier carry, which is in-memory only; select-terminal joins and pre-join where push down.
Correlated multiple from over a non-array source: the flattening SelectMany (from l in o.lines) is supported for in-memory array sources; a SQL source rejects it (no table for the per-row collection), and XML / decs have no nested-collection shape. orderby / group over a correlated flatten, and a post-from where referencing the outer variable, are rejected (the non-copyable outer can't ride the transparent-identifier carry); an inner-only post-from where is pushed into the collection. The uncorrelated form (independent second source → cross product) is supported on all sources (see Multiple from (SelectMany)). N-ary from … from … from and from … from combined with join are also rejected.
into continuations on two-source queries: a query continuation is supported on single-source queries only (see Query continuation (into)).
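The comment stripping described under Comments in the body can be sketched in Python as follows. This is a minimal model under stated assumptions, not the reader's actual scanner: strip_comments is a hypothetical name, and only double-quoted string literals with backslash escapes are recognized.

```python
def strip_comments(body: str) -> str:
    """Blank out // and /* */ comments, keeping length and newlines intact."""
    out = []
    i, n = 0, len(body)
    while i < n:
        if body[i] == '"':
            # copy a string literal verbatim, honoring backslash escapes
            j = i + 1
            while j < n and body[j] != '"':
                j += 2 if body[j] == "\\" else 1
            j = min(j + 1, n)
            out.append(body[i:j])
            i = j
        elif body.startswith("//", i):
            # line comment: blank up to, but not including, the newline
            j = body.find("\n", i)
            j = n if j == -1 else j
            out.append(" " * (j - i))
            i = j
        elif body.startswith("/*", i):
            # block comment: blank everything except embedded newlines
            j = body.find("*/", i + 2)
            j = n if j == -1 else j + 2
            out.append("".join("\n" if ch == "\n" else " " for ch in body[i:j]))
            i = j
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

src = 'from c in cars // where bogus\nwhere c.name != "a//b" /* select\n x */ select c.name'
print(strip_comments(src).split())
```

Replacing comments with spaces instead of deleting them keeps every remaining token at its original line and column, so positions in the stripped body still line up with the source.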