---
orphan: false
---

# What's new in v0.2

```{versionadded} v0.2
```

v0.2 brings MIME into full alignment with **MADDENING v0.2.1** and lands the two pieces of new architecture the v1.0 stack roadmap puts on the critical path: **multi-GPU sharding of the IB-LBM fluid node** (the infrastructure unblock for the de Boer step-out replication at scale) and the **EffectModel contract pilot** (the first cut of the composable force-field-effect abstraction). It also adopts the v0.2 node-API surfaces (`halo_width()`, the `static_data` channel, compile-time edge validation) and folds in the TF32 / float32 numerical-precision fixes.

```{warning}
**MADDENING floor is now `>=0.2.1,<0.3`.** The IB-LBM sharded path consumes v0.2.1's sharded-`StaticArray` slicing and `domain_integral_fields`, and v0.2.1 escalates edge shape/dtype mismatches from warnings to a hard `ExceptionGroup` of `EdgeValidationError`s at `compile()`. Build a complete `boundary_input_spec()` on custom nodes. See [Edge validation](../architecture/edge_validation.md).
```

The [`CHANGELOG.md`](https://github.com/Microrobotics-Simulation-Framework/MIME/blob/main/CHANGELOG.md) has the itemised diff. This page is the narrative summary.

## Highlights

### Multi-GPU IB-LBM sharding

`IBLBMFluidNode` now runs **pencil-decomposed across devices** via MADDENING v0.2.1's `ShardedStencilNode`.
The full multi-device `update_padded` is implemented (it previously raised `NotImplementedError`):

* collision runs on the halo-padded slab;
* streaming switches from the periodic `jnp.roll` to a slice-based halo read (`d3q19.stream_padded`);
* the pipe-wall and rotating-UMR missing-link masks are recomputed per slab (`bounce_back.compute_missing_mask_sharded`);
* the UMR body and its rotation-velocity field are rebuilt on each slab's *global* coordinate range (an `origin` offset threaded through the geometry helpers and the momentum-exchange torque);
* `drag_force` / `drag_torque` are declared `domain_integral_fields` so the wrapper `lax.psum`s the per-slab partial sums across the mesh.

Construct the node with `multigpu_shard_axis=` and wrap it in `ShardedStencilNode(node, mesh, axis_map={"devices": }, boundary="periodic")`. The pipe wall is the only sharded `StaticArray` (it is axis-aligned `(nx,ny,nz)`); the missing-link masks are recomputed per slab rather than carried as a transposed `(19, …)` static, which the wrapper's shard-axis check rejects.

**Validated bit-identical** to the single-device step under jit on a 4-device CPU virtual-device mesh: the distribution field and both psum'd drag integrals, for a single step and a 50-step rotating trajectory (`tests/verification/test_lbm_sharded_contract.py`, marked `slow`; run with `JAX_PLATFORMS=cpu XLA_FLAGS="--xla_force_host_platform_device_count=4"`).

```{note}
Two narrow pieces stay deferred: **Bouzidi interpolated bounce-back on the sharded path** (needs per-slab SDF q-value recomputation; the node guards `use_bouzidi=True` under sharding with a clear `NotImplementedError`), and **real multi-GPU-cluster validation** of the de Boer sweep (the node-level bit-compat on virtual devices is the infrastructure proof; cluster runs are a separate hardware step).
```

### EffectModel contract pilot (`mime.effects`)

A new package introduces the **EffectModel** abstraction, a common builder pattern for environment effects that deliver force/torque to a body (per `ADR-2026-EFFECT-MODEL`). v0.2 ships the *pilot*: the Protocol surface, the registry, the `Experiment` composition + validation, and the `HydrodynamicModel` family. See [EffectModel contract](../architecture/effect_model_contract.md).

* **Protocol surface**: `EffectModel` (`coupling_ports` / `applicable_regime` / `build`), `SourcedEffectModel[S]` (covariant source TypeVar, for the v0.3 magnetic family), `EffectHandle`, `CouplingSpec`, and a polymorphic `Regime` ABC with `HydrodynamicRegime`.
* **Registry**: `@register_effect("Family.Backend")` / `list_registered_effects()` / `get_effect()` (decorator-based and auditable; no importlib auto-discovery).
* **`Experiment`**: `attach()` / `couple()` / `build()` with a **six-pass validation contract**, each pass raising a specific typed error (`CouplingError`, `PortTypeMismatchError`, `BodyPropertyMissing`, `MediumPropertyMissing`, `EffectBuildError`), plus **load-time** MIME-version compatibility validation (`IncompatibleMimeVersionError`) and the version-and-provenance metadata fields (`asset_paths`/`asset_hashes`, `benchmark_refs`, `citation`).
* **`HydrodynamicModel`**: `LBM` / `FVM` / `Stokeslet` / `DefectCorrection` backends that adapt the existing fluid nodes over the shared drag contract, so one backend can be swapped for another across the same graph edges. The LBM backend carries the lattice→SI edge transforms; the others are SI.
* **Runnable concept-proof**: `test_effectmodel_stokes_drag_swap` composes a kinematic sphere + a backend through `Experiment` (driving it via composed external inputs), runs it, and reads the drag: the Stokeslet backend reproduces analytical Stokes drag (`6πμaV`) to ≈0.4%, and swapping a single `attach()` line runs the FVM Navier–Stokes solver across the identical body/edges.
  (Free-space drag; the confined microrobot experiments need the v0.3 magnetic family + coupling composition.)
* **Drag sign normalization**: the raw fluid nodes disagree on sign (IBLBM and standalone Stokeslet report the reaction `+R·motion`, which is anti-dissipative if a body adds it: a de-Boer UMR diverges without its `omega_max` clamp; FVM reports the force on the body). The `HydrodynamicModel` adapter normalizes every backend to the contract sign (**force on the body**) via `native_drag_sign`, so a swapped backend always delivers dissipative drag. `Experiment.add_external_input` composes graph-external inputs (e.g. a kinematic body's prescribed velocity).

```{note}
The `MagneticModel` family, the `SourceInputProvider` sub-abstraction, the cross-effect coupling *implementation* behind the ports, and full `make_experiment(params) -> Experiment` migration of the de Boer / de Jongh experiments are deferred to **v0.3**: they need the magnetic family, since those experiments' actuation chains are not yet EffectModels.
```

### MADDENING v0.2 node-API alignment

* `requires_halo` (boolean property) is superseded by `halo_width()` (a method returning `dict[int, int]`). v0.2 derives the old property for backward compatibility, but subclassing a node that still defines `requires_halo` emits a `FutureWarning`; v0.3 escalates this to a `MigrationError`. See [Node API migration](../architecture/node_api_migration.md).
* `static_data` is the channel for **large non-evolving arrays** (meshes, BEM LU factors, MLP weights). It keeps them out of the state pytree and out of every checkpoint and, when declared `replication="shard"`, is what the sharded LBM path slices per device. See [Static data usage](../architecture/static_data_usage.md).
* **Shared fluid-node contract**: `FVMFluidNode`, `IBLBMFluidNode`, `StokesletFluidNode` and `DefectCorrectionFluidNode` are graph-interchangeable for the single-immersed-body case (`drag_force` / `drag_torque` outputs, `body_*` inputs).
  FVM was reconciled additively (single-body FVM now exposes the contract names alongside its multi-body `force_` / `torque_` extension). The contract lives at `src/mime/nodes/environment/FLUID_NODE_CONTRACT.md`; the EffectModel `HydrodynamicModel` family builds on it.

### Compile-time edge validation

`GraphManager.compile()` checks every edge's shape, dtype and units against the target's `BoundaryInputSpec`. MADDENING v0.2.1 raises an `ExceptionGroup` of `ShapeMismatchError` / `DtypeMismatchError` on shape/dtype mismatches; unit mismatches stay advisory (`UnitMismatchWarning`). MIME's experiment graphs are pinned clean by a test. See [Edge validation](../architecture/edge_validation.md).

### Preempt-resilient parameter sweeps

`mime.data.sweep_resume.ResumableSweep` wraps MADDENING v0.2's checkpoint API (`save_state_with_manifest` / `load_state_with_manifest`, manifest-hash integrity) into a sweep facility: items finished in a prior run are skipped, each new item is checkpointed immediately, and a snapshot directory mirrors state to durable storage so cloud-spot preemptions resume cleanly. `scripts/run_confinement_sweep.py` is the worked example. See [Preempt / resume](../preempt_resume.md).

### Runner: profiler, opt-in binary stream, declarative actor poses

The experiment runner (`mime.runner.server`) gains REP commands:

* `profile` / `profile_jax_{start,stop,status}`: return a Perfetto-format trace from MADDENING's profiler; `scripts/profile_experiment.py` is the thin client.
* `stream_info` / `set_stream_format`: frames publish as JSON by default; compact `BinaryStateEncoder` frames are opt-in via `MIME_STREAM_FORMAT=binary` (or `set_stream_format`), with `stream_info` as the handshake by which a consumer learns the active format and decode schema.

The runner also no longer hardcodes pose recipes for `arm_link_`, `motor_rotor` and `magnet`.
An actor whose pose lives at a non-default location declares it in `experiment.yaml`:

```yaml
scene:
  actors:
    motor_rotor:
      pose_from: { node: motor, field: rotor_pose_world }
    arm_link_3:
      pose_from: { node: arm, field: link_poses_world, index: 3 }
    body:
      state_fields: [position, orientation]   # unchanged generic path
```

### Cloud: AWS / GCP

`scripts/launch_job.py` is the provider-agnostic launcher; `jobs/production_h100_aws.yaml` and `jobs/production_h100_gcp.yaml` are spot-enabled templates (safe now that `ResumableSweep` resumes preempted sweeps). See [Launching cloud jobs](../infrastructure/cloud_launch.md).

### Numerical & precision fixes

* **TF32**: GPU TF32 was silently corrupting low-magnitude float32 matmuls (e.g. LBM momentum-moment near-cancellations). v0.2 forces full precision in the LBM moment transforms and the FVM pressure solver; custom GPU code should use `precision="highest"` for precision-sensitive matmuls. See [GPU precision](../user_guide/gpu_precision.md).
* **LBM mass conservation**: fixed a BGK-collision rounding bias that drifted total density over long runs.
* **Oberbeck-Stechert drag coefficients**: `oberbeck_stechert_coefficients` was reformulated for float32 stability. The prolate-spheroid denominators subtract a leading `±2e` that cancels against `2·atanh(e)`, destroying single-precision accuracy near `e = 0` (`C₁ = 1.037` instead of `1.000` at `e = 0.01`). The cancellation-free form via `g = atanh(e) − e` (a Maclaurin series for small `e`, direct `atanh` otherwise) is algebraically identical and validated to ≤ 2.6 × 10⁻⁷ against a float64 reference across `e ∈ [0.001, 0.99]`. See [Rigid body](../algorithm_guide/nodes/rigid_body.md).

### Test infrastructure: scoped `jax_enable_x64`

Double precision is now opt-in per test via a `@pytest.mark.x64` marker plus an autouse `conftest.py` fixture that enables x64 for marked tests and restores the prior value on teardown.
This replaces module-level `jax.config.update("jax_enable_x64", True)` calls that ran at *collection* time and leaked x64 into the whole session, an order-dependence hazard under `pytest-xdist` that also masked genuine float32 defects (the Oberbeck cancellation above was one). `tests/test_x64_isolation.py` guards against regressions.

## Migration playbook

A custom `SimulationNode` written against MIME v0.1 needs at minimum:

1. **Replace `requires_halo` with `halo_width()`.** Pointwise nodes drop the old property and inherit the `{}` default; stencil nodes return a per-axis halo dict. See [the migration guide](../architecture/node_api_migration.md).
2. **(Optional) Move large non-evolving arrays to `static_data`.** Wrap them in `StaticArray(value)` and override the `static_data` property. For a shardable stencil node, declare the spatial-axis-aligned arrays `replication="shard"`. See [Static data usage](../architecture/static_data_usage.md).
3. **Declare a complete `boundary_input_spec()`.** Every input gets a concrete shape, dtype, and `expected_units` tag. Edge validation now *errors* on shape/dtype mismatches at `compile()`.
4. **Bump the `maddening` pin** to `>=0.2.1,<0.3`.
5. **Migrate experiment configs with composite actors.** Any actor whose name does not match a graph node (`arm_link_`, `motor_rotor`, `magnet`, …) needs an explicit `pose_from: { node, field, index? }` block in `experiment.yaml`; the runner no longer hardcodes these recipes. See the migrated `experiments/ar4_helical_drive/experiment.yaml`.
6. **(Test authors) Replace module-level `jax.config.update("jax_enable_x64", True)`** with `pytestmark = pytest.mark.x64` (or per-test `@pytest.mark.x64`). Tests that silently relied on a leaked x64 will fail in isolation: give them the marker, or make the assertion float32-robust.
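For step 6, making a computation float32-robust usually means removing a cancellation rather than loosening a tolerance. The sketch below illustrates that on `g = atanh(e) − e`, the quantity named in the Oberbeck-Stechert fix. It is not MIME's implementation: `f32`, `g_naive`, `g_stable`, `rel_err` and the `series_below` switch-over point are hypothetical names and values chosen for illustration, and single precision is emulated with the standard library's `struct` module rather than with JAX.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a float64 value to the nearest float32 (emulates single precision)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def g_naive(e: float) -> float:
    """atanh(e) - e evaluated directly, rounding to float32 after each step."""
    e = f32(e)
    return f32(f32(math.atanh(e)) - e)


def g_stable(e: float, series_below: float = 0.2) -> float:
    """atanh(e) - e without the leading cancellation.

    series_below is a hypothetical switch-over point, not MIME's value.
    """
    e = f32(e)
    if e >= series_below:
        return g_naive(e)  # cancellation is mild here; evaluate directly
    # Maclaurin series: atanh(e) - e = e^3/3 + e^5/5 + e^7/7 + ...
    e2 = f32(e * e)
    power = f32(e * e2)  # e^3
    total = 0.0
    for k in range(3, 21, 2):
        total = f32(total + f32(power / k))
        power = f32(power * e2)  # next odd power of e
    return total


def rel_err(approx: float, e: float) -> float:
    """Relative error against a float64 series reference (small e only)."""
    e = f32(e)
    ref = math.fsum(e ** k / k for k in range(3, 83, 2))
    return abs(approx - ref) / ref


# Compare both evaluations at a few small eccentricities.
for e in (0.001, 0.01, 0.1):
    print(e, rel_err(g_naive(e), e), rel_err(g_stable(e), e))
```

For small `e` the direct form is limited by the float32 spacing of `atanh(e)` itself, which is large compared with the `e³/3` that survives the subtraction; the series form never forms that difference.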
## New optional dependencies

| Pulls in | For |
|---|---|
| `exceptiongroup; python_version < "3.11"` (transitive via `maddening>=0.2.1`) | The 3.10 backport of `ExceptionGroup` used by `compile()`'s edge-validation raise. |

## What's still in flight

Tracked before/after the `v0.2.0` tag:

* **Bouzidi IBB on the sharded LBM path**: per-slab SDF q-value recomputation (simple bounce-back is sharded today).
* **FVM and BEM sharding**: blocked on MADDENING v0.3 infrastructure (graph-partitioning halos / a distributed sparse iterative solver). See [Optional v0.2 features](../architecture/v0.2_optional_features.md).
* **EffectModel v0.3 scope**: the `MagneticModel` family, `SourceInputProvider`, cross-effect coupling implementation, and the full de Boer / de Jongh `make_experiment` migration.
* **Folding `feat/v0.2-fitup` into `main` and tagging `v0.2.0`.**