Rebase the Rust-port branch onto master: full merge of 64 commits + Rust-side ports#380
Draft
alcides wants to merge 17 commits into
Draft
Rebase the Rust-port branch onto master: full merge of 64 commits + Rust-side ports#380alcides wants to merge 17 commits into
alcides wants to merge 17 commits into
Conversation
Stand up a new `aeon-rs` Rust crate alongside the existing Python
package and rebuild the entire compiler core in Rust, exposed as a
Python extension module through PyO3. The Python layer becomes
re-export shims around the Rust implementations plus the small
orchestration glue that needs Python (parsing via lark, synthesis
via geneticengine / sklearn / ollama, LSP via pygls, LLVM via
llvmlite, runtime FFI library bindings).
AST & helpers
Name, FreshCounter, Kind, all term/type/liquid AST classes, sugar
AST (STerm/SType/Node hierarchy), locations.
Substitutions / lifts
Core and sugar substitutions, liquefy/inline_lets, term-in-type
walks, alpha-equivalence keys.
Verification
SMT encoder (PLE, flatten, native z3 solver layer — no z3-py),
Constraint hierarchy, Horn-clause solver, qualifier extraction,
constraint helpers (CNF, simplification, pretty-print).
Typechecking
TypingContext, bidirectional `synth`/`check`/`check_type`,
well-formedness, entailment, termination metric constraints,
linearity (QTT), partial VC builder, liquid type checker.
Sugar pipeline
Sugar AST helpers (equality, lifting, normalize, substitutions),
bind (name resolution), lowering (sugar→core),
desugar (algorithm + orchestration), elaboration (unification
with mutable polar variables), the algorithmic core of imports.
Runtime
Closure pyclass + eval loop (calls Python `eval` only for the
`native` FFI escape hatch).
Misc
Multiplicity (QTT semiring) singletons, liquid operator prelude,
core + surface pretty-printers, ANF / partial evaluator, decorator
dispatching (calls Python decorator bodies via FFI), inductive
constructor registry, static prelude constants.
The lark-based parser (in `aeon/sugar/parser.py` + `aeon/core/parser.py`),
the synthesis backends (genetic programming, synquid, tdsyn, lta,
decision_tree, llm — all ecosystem-locked to geneticengine, scikit-learn,
sympy, ollama), the LLVM codegen (`llvmlite`), the LSP server (`pygls`),
runtime FFI bindings (`aeon/bindings/{image,learning,table}.py`
called from `.ae` libraries via native FFI strings), the decorator
function bodies (`minimize`, `csv_data`, `gpu`, etc.), and the prelude
operator lambdas. Rust calls back into Python through PyO3 at exactly
these boundaries.
Adds `aeon-rs` as a `[tool.uv.workspace]` member and a build
dependency on `maturin`. `uv pip install -e ".[dev]"` builds the
Rust crate as a Python extension. `maturin develop --release
--manifest-path aeon-rs/Cargo.toml` rebuilds in place during
development.
All 974 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cking
A new backend, `rust_enum` (registered alongside `gp`, `synquid`,
`tdsyn`, etc.), generates random core Aeon terms in Rust, rejects
those that don't type-check, evaluates the survivors via the
existing Python fitness pipeline, and maintains a Pareto front of
non-dominated individuals over the entire budget.
`aeon-rs/src/rust_enum_synth.rs` (~700 LOC) implements the
`RustEnumSynthesizer` pyclass:
- **Type-directed term generation**: peels refinements and
polymorphism wrappers, then matches on the target's shape:
* `Int` / `Bool` / `Float` targets — emit a random literal
(biased toward distinguished values: 0, 1, -1, ...).
* `AbstractionType(x: A, B)` — emit an `Abstraction`, push the
binder into scope, recurse into the body.
* Otherwise — pick a context variable whose return type's base
name matches the target (monomorphic case) or whose type is a
`forall` whose return is a TypeVar (polymorphic case). For
polymorphic vars, peel each outer `forall` and emit a
`TypeApplication` per layer, substituting the target type for
the bound type variable. Then build the application chain by
recursively generating one argument per parameter.
* Sometimes (15%) emit an `If` with a generated Bool condition.
- **Pareto front**: `Vec<(Vec<f64>, PyObject)>` of non-dominated
individuals. Insertion checks dominance both ways: rejects the new
point if anything in the front dominates it, prunes everything the
new point dominates, otherwise admits it. Lexicographic
tie-breaking picks the "winner" returned to the caller.
- **Budget enforcement**: time-bounded outer loop (`std::time::Instant`).
A separate `hard_budget` counter caps the recursive call count per
trial so polymorphic targets can't trigger unbounded recursion
through the context (a TypeVar return cycles back through
`random_var_of_type` indefinitely without it).
- **UI integration**: on each successful evaluation, calls the
optional `ui.register(term, score, elapsed, is_better)` so the
terminal UI prints candidates in real time, mirroring the other
backends.
Two entry points: `synthesize` (matches the `Synthesizer.synthesize`
ABC; returns just the best term) and `synthesize_with_front` (also
returns the full Pareto front — handy for tests and introspection).
Python glue (`aeon/synthesis/modules/rust_enum.py`):
- `RustEnumSynthesizerWrapper(Synthesizer)` subclass that delegates
to the Rust core. Registered in
`aeon/synthesis/modules/synthesizerfactory.py` under name `"rust_enum"`.
New dependency in `aeon-rs/Cargo.toml`: `rand = { features = ["small_rng"] }`.
Tests in `tests/rust_enum_synth_test.py` cover: construction,
factory wiring, minimal Int synthesis, Pareto-front accumulation
under multi-objective trade-offs, and rejection when no candidate
validates.
End-to-end demo: on `examples/synthesis/dummy.ae`
(`def fun (i:Int | i > 0) : {j:Int | j > i} = ?todo` with
`@minimize_int(-(fun 3 - fun 5))`), the synthesizer hits
`i - -31` (= `i + 31`) at fitness 2.0 within 300 ms.
All 970 tests pass (965 prior + 5 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… types
Before: every `forall a` in a context variable's type was always
instantiated with the *target type*. That's correct for the
arithmetic operators (`forall a. a -> a -> a` — `+`, `-`, `*`, `/`),
where the type variable appears in the return position and must
agree with the goal type. But for comparison operators
(`forall a. a -> a -> Bool` — `==`, `!=`, `<`, `<=`, `>`, `>=`),
`a` only appears in the arguments; instantiating it to `Bool`
collapses the search to `bool == bool` and never reaches
`int_expr == int_expr`. Same for `print : forall a. a -> Unit`
and the application combinator `$ : forall a, b. ...`.
The fix is a per-forall check: walk the body to its return base and
see whether it equals the bound type variable.
* **In return position** → forced: instantiate to the target.
* **Argument-only** → free: pick uniformly from a curated pool of
concrete primitives (`Int`, `Bool`, `Float`, `String`, `Set`).
The pool is the set of primitive types we already know how to
generate values for. `Tensor` and `GpuConfig` are excluded because
they're runtime-only and we have no literal-generation path for
them.
Also: added `random_string_literal` (with a small bag of
distinguished strings: empty, single chars, common words) so that
when a free type variable lands on `String`, the argument
generation has something to produce.
Concrete improvement on a Bool target (`def fun(i:Int) : Bool = ?todo`):
- Before: only emitted `True`, `False`, and `bool && bool`-shaped
terms — the `i` parameter was never used.
- After: emits things like `1.0 >= -3.14`, `Set_dif Set_empty (...)
!= Set_empty`, `False && True` — comparisons at Int/Float/Set
show up alongside the trivial Bool literals, and the search
reaches the parameter through any of the primitive comparison
axes.
All 970 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ifted
After the master merge reverted every previously-Rust-ported module to
its full Python form, this PR rebuilds the shim layer for the subset
where the Rust pyclass cleanly substitutes for master's class — no
new fields, no new variants, no `type(x) is X` checks that would
notice the swap.
Newly back on Rust:
- `aeon/utils/superscripts.py` — pure string-translation table; the
port (`aeon-rs/src/superscripts.rs`) is a string→string `pyfunction`
pair. Only consumer is `aeon.utils.name`.
- `aeon/utils/location.py` — `Location` / `FileLocation` /
`SynthesizedLocation` come from `aeon_rs`. Identical surface
(`get_start`, `get_end`, hashable, frozen).
- `aeon/utils/name.py` — `Name` and `FreshCounter` from `aeon_rs`.
`Name.pretty()`, `__str__` (with superscripted id), `__eq__`,
`__lt__`, `__hash__` all match the Python original.
- `aeon/core/multiplicity.py` — `Multiplicity` plus the singleton
triplet `M0` / `M1` / `MOmega` / `MN`, plus the `add` / `mul` /
`from_token` functions. Identity (`is`) checks continue to work
because each constant is cached at module init.
- `aeon/utils/pprint_helpers.py` — Wadler-Leijen Doc combinator. The
Rust port is a single `Doc` pyclass instead of the Python hierarchy
(`Nil` / `Text` / `Concat` / `Nest` / …), but the functional API
(`text`, `line`, `group`, `concat`, `nest`, `parens`,
`insert_between`, `nil`, `soft_line`, `hard_line`, plus the
`DEFAULT_WIDTH` / `DEFAULT_TAB_SIZE` constants) is what every caller
uses — no pattern matching on Doc variants anywhere in the tree.
Modules still in master's Python form (and likely stuck there until
the Rust ports catch up with new variants or hand-written
type-identity checks): `core/{liquid_ops, normal_form, pprint}`,
`decorators/api`, `verification/constructor_registry`,
`typechecking/{linearity, termination, qualifiers, partial_vc}`,
`prelude/prelude` (already partial), plus the big AST hierarchies
(`core/{liquid, terms, types}`, `sugar/{program, stypes}`,
`elaboration`, `sugar/{bind, desugar, lowering, equality, lifting,
substitutions}`, `typechecking/{context, entailment, typeinfer,
well_formed, liquid}`, `verification/{horn, sub, smt, helpers, vcs}`,
`utils/pprint`). Master added new variants (`LiquidLiteralUnit`,
`ImplicitRefinementHole`, `TypeVarDeclaration`) and structural
changes (typeclasses, Lean-style dictionary passing, capture-avoiding
substitution rework) that the Rust port doesn't track.
All 1230 tests pass (9 skipped, 3 xfailed).
Add the new master variant `LiquidLiteralUnit` (the lone Unit-typed liquid value — gets its own SMT sort to reject `unit == True`) as a pyclass in `aeon-rs/src/liquid.rs`. Also add `ensure_liqterm` and `is_safe_for_application` as pyfunctions matching the master Python helpers. Loosen `LiquidTerm.__new__` to accept and ignore arbitrary `(*args, **kwargs)` so Python subclasses (the Python dataclass `LiquidHornApplication` in `aeon.core.types` chief among them) can forward their own constructor arguments through `super().__new__`. `aeon/core/liquid.py` is back to a re-export shim plus the small recursive `liquid_free_vars` helper (the Rust port has its own, but this one stays in Python since `aeon.core.types` uses `isinstance(e, LiquidVar)` style traversal that's clearer to read in Python). All 1230 tests pass.
Match master's `Kind(Enum)` shape:
- `aeon-rs/src/kind.rs` collapses the old `Kind`/`BaseKind`/`StarKind`
trio into a single `Kind` pyclass with an internal `KindVariant`
enum and two cached singletons exposed as class attributes
(`Kind.BASE`, `Kind.STAR`). Backward-compatible `BaseKind()` and
`StarKind()` factory functions remain so any pre-Enum caller still
gets the canonical singleton.
- Added `crate::kind::{base, star, is_base, is_star}` helpers; updated
every internal call site in `desugar.rs`, `elaboration.rs`,
`typectx.rs`, `partial_vc.rs`, `typeinfer.rs`, `well_formed.rs`
to use them in place of `Py::new(py, (BaseKind, Kind))` and
`downcast::<BaseKind>` patterns.
Add the new Term variant introduced by master:
- `ImplicitRefinementHole` in `aeon-rs/src/terms.rs` — the placeholder
the elaborator inserts when instantiating a `RefinementPolymorphism`.
Distinct from `Hole` so synthesis-side `case Hole(...)` patterns
don't match it.
Loosen the base-class constructors so master's Python subclasses
(`LiquidHornApplication` in `core/types`, etc.) can forward
constructor arguments via `super().__new__()`:
- `LiquidTerm.__new__`, `Term.__new__`, `Type.__new__` now accept
`(*args, **kwargs)` and ignore them.
Restore `aeon/core/terms.py` as a re-export shim. All 15 Term
variants (including `ImplicitRefinementHole`) now come from
`aeon_rs`.
All 1230 tests pass.
Add the master-shaped type-walking helpers to Rust in a new module `aeon-rs/src/core_types_helpers.rs`: - `base(ty)` — strip a `RefinedType` wrapper. - `extract_parts(ty)` — `(name, base, refinement)` decomposition for refined / bare types. - `is_bare(ty)` — true iff `ty` has no user-written refinement (`LiquidHole` / `LiquidHornApplication` placeholders count as bare). - `type_free_term_vars(ty)` — free term-level Names, respecting all binder scopes and skipping the operator prelude (`ALL_OPS`). The full existential-case logic is mirrored. - `get_type_vars(ty)` — free TypeVars, stripping the ones bound by `TypePolymorphism`. - `with_binders(extra, ty)` — smart `ExistentialType` constructor that flattens nested wrappers. Plus the built-in `TypeConstructor` constants (`t_unit`, `t_bool`, `t_int`, `t_float`, `t_string`, `t_set`, `t_tensor`, `t_gpu_config`), the `builtin_core_types` list, and the `top` / `liq_true` singletons — all built once at Rust module init and exposed as module attributes, so every Python import sees the same instances. `aeon/core/types.py` becomes a thin re-export shim. All 1230 tests pass.
`aeon-rs/src/core_equality.rs` (new, ~500 LOC) implements all four
alpha-equivalence functions from master:
- `core_liquid_equality(lq1, lq2, rename_left=None)` — recursive
equality on `LiquidTerm` with bound-variable renaming.
- `core_type_equality(t1, t2, rename_left=None)` — same for `Type`,
handling `RefinedType`, `AbstractionType`, `TypePolymorphism`,
`RefinementPolymorphism`, `TypeConstructor`, `Top`.
- `canonicalize_type(ty, rename=None, counter=None)` — rebuilds a
type into a deterministic alpha-equivalence normal form by renaming
every binder to `Name("__c<N>__", 0)` from a shared pre-order
counter, threading the counter through the recursion. The result
compares equal / hashes equal iff two types are alpha-equivalent.
The Python original threaded the counter via a one-element
`list[int]`; the shim does the same so calls like
`canonicalize_type(t, counter=[0])` continue to mutate the list.
- `core_term_equality(term1, term2, rename_left=None)` — same for
`Term`, covering `Var` / `Literal` / `Application` / `Abstraction` /
`Let` / `Rec` / `If` / `TypeApplication` / `TypeAbstraction` /
`RefinementAbstraction` / `RefinementApplication`.
`aeon/core/equality.py` becomes a re-export shim.
`aeon/decorators/api.py` becomes a shim importing `metadata_update`,
`metadata_update_by_name`, and `CORE_DECORATOR_QUEUE_META_KEY` from
the Rust core. `metadata_update` was duck-typed (`fun.name` via
`getattr` instead of borrowing as Rust `Definition`) so it accepts
master's Python `Definition` shape.
All 1230 tests pass.
- `aeon/optimization/normal_form.py` — re-export `nf`, `normal_form`, `optimize` from the existing `aeon-rs/src/normal_form.rs` partial evaluator. - `aeon/core/liquid_ops.py` — re-export `liquid_prelude`, `mk_liquid_and`, `mk_liquid_or` from `aeon-rs/src/liquid_ops.rs`. - `aeon/verification/constructor_registry.py` — re-export `register_constructors`, `get_constructor_groups`, `clear_constructor_registry` from `aeon-rs/src/constructor_registry.rs`. The other typechecking modules (`qualifiers`, `linearity`, `termination`, `partial_vc`) and `aeon.verification.vcs` can't be re-shimmed yet — their Rust functions take `Py<TypingContext>` / `Py<Definition>` of the Rust pyclass shape, and master's Python TypingContext is no longer the Rust type. They'd need duck-typing work first. All 1230 tests pass.
- Abstraction.__repr__ now prints Lean-style '(fun x => body)' (#340) - Top carries a loc field (master made it a dataclass with a SynthesizedLocation default); all in-crate constructions updated - RefinementPolymorphism.__str__ drops the ' -> Bool' suffix to match master's n-ary abstract-refinement rendering (#346)
Master's new features use dataclass conveniences that the Rust pyclasses do not provide; restore them idiomatically: - typeinfer._erase_return_refinement, evaluation_pool.set_program_tail, pbt.runner._replace_tail: replace dataclasses.replace() with explicit node reconstruction (all fields, including loc/multiplicity, preserved) - lsp.navigation.iter_terms: walk children via __match_args__ (exposed by both dataclasses and the Rust pyclasses) instead of dataclasses.fields, which silently yields nothing on Rust nodes - new aeon/core/pickling.py: copyreg reducers so pickle/dill can ship core Term trees across multiprocess queues (required by the new evaluation pool from #365); Kind/Multiplicity singletons restored by name to keep identity-based matching working
- rust_enum: match the widened Synthesizer.synthesize signature (output_value parameter from #342's output-clustering work) - linearity: explicit TypeAlias for _Usage (Multiplicity now comes from the untyped Rust extension, so implicit alias detection doesn't fire) - utils/pprint: rename the InductiveDecl 'args' capture so it doesn't join with Definition's differently-typed 'args' capture - drop type-ignores that are unused against the Rust class types
synthesize() already accepted any context exposing vars(); the with-front variant still demanded the Rust TypingContext pyclass and so rejected the pipeline's Python TypingContext. Mirror the duck-typed extraction and drop the now-passing xfail markers from its tests.
The z3 crate links against the system libz3, which GitHub's ubuntu-latest runners don't ship. This PR is the first time the Rust-port branch targets master, so the full python-app workflow (which builds the extension via 'uv pip install -e .') runs here for the first time.
The 3.14 job died fetching minimal-lexical from crates.io with curl's 'Error in the HTTP2 framing layer' and fail-fast cancelled the rest of the matrix. Disable cargo's HTTP/2 multiplexing, raise its retry count, and let matrix entries finish independently.
The branch-only Vector library (kept for the skipped LLVM e2e tests) still used pre-Lean '=' definitions and backslash lambdas, breaking the docs workflow's stdlib AeonDoc build under master's parser. ':=' and 'fun x =>' now; scripts/build_stdlib_docs.py passes for all 50 modules.
e6de566 to
bc01d3b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Brings the Rust-port stack (tip of #257,
claude/aeon-rs-pr10-flatten-helpers) up to date with current master — all 64 commits since the last sync (#324), through #379 (selfification). Every new master feature now runs on the Rust-backed core, and every master-side change that touched a Rust-owned module has been ported intoaeon-rs.Supersedes the stale state of #257.
Merge strategy
The branch's established pattern is merging master in (it had done so three times); a literal rebase of ~670 first-parent commits would rewrite the whole stack's history. Conflict policy:
examples/,libraries/, tests: master wins (Lean-style operators/fun x =>lambdas, dot-method stdlib, inductiveList, new test suites are authoritative).aeon/synthesis/{api,entrypoint,ge_synthesis}.py: master wins — restoresInvalidIndividualExceptionand the evaluation-pool protocol from fix(synthesis): correct eval-sandbox timeout cleanup; seed stochastic backends via AEON_SEED #365 that an earlier branch state had rolled back.synthesizerfactory.py: union — therust_enumbackend coexists with master's newsymetric/fta/afta/catabackends.aeon/core/{terms,types}.py: the Rust shims stay; master's deltas were ported into the crate (below).Master deltas ported into
aeon-rsAbstraction.__repr__prints Lean-style(fun x => body)(Migrate lambdas to lean4's syntax #340).Topcarries alocfield (master made it a dataclass;lowering.pyconstructsTop(loc=...)); all in-crate construction sites updated.RefinementPolymorphism.__str__matches the n-ary abstract-refinement rendering (feat(refinements): generalise abstract refinements to n-ary relations #346).rust_enum'ssynthesize_with_frontnow duck-types the typing context likesynthesizealready did, so it accepts the pipeline's PythonTypingContext; its previously-xfailed tests now pass and the markers are removed.Master features adapted to the Rust core
Master's new code leaned on dataclass conveniences the Rust pyclasses don't provide:
dataclasses.replace()on core nodes → explicit reconstruction (all fields incl.loc/multiplicitypreserved) intypeinfer._erase_return_refinement,evaluation_pool.set_program_tail,pbt.runner._replace_tail.lsp.navigation.iter_termswalked children viadataclasses.fields, which silently yields nothing on Rust nodes → now walks__match_args__(works for both).aeon/core/pickling.py:copyregreducers so pickle/dill can ship coreTermtrees acrossmultiprocessqueues (required by fix(synthesis): correct eval-sandbox timeout cleanup; seed stochastic backends via AEON_SEED #365's evaluation pool).Kind/Multiplicitysingletons are restored by name so identity-based matching keeps working across processes.TypeAliasinlinearity, widenedrust_enum.synthesizesignature foroutput_valuefrom SyMetric: output-distance clustering, @cluster featuriser, suitability guard #342, capture rename inutils/pprint, dropped now-unused ignores).What stays Python
The live Rust surface (core AST + Name/Kind/Location/Multiplicity, equality, liquid ops, normal form, pprint Doc combinators, constructor registry, the
rust_enumsynthesizer) is fully synced with master semantics. Master's feature work in parser/elaboration/typechecking/LSP/synthesis-backends/libraries flows through pure-Python modules that pattern-match on the Rust pyclasses, so selfification, division-by-zero checking, theaxiomkeyword, PBT, and the new synthesis backends all work on the Rust core unchanged. The crate's dormant mirrors of those modules (kept for the eventual full port) are untouched.Results
bash run_examples.sh: all examples pass, including master's new PBT/testing suites.🤖 Generated with Claude Code
https://claude.ai/code/session_01ExB9vhTQitxZxGZ4wtEQbZ
Generated by Claude Code