# v0.2 optional-feature adoption

MADDENING v0.2 ships several optional surfaces a downstream project *may* adopt. The MIME v0.2 fit-up §7 first evaluated all five and deferred them; §8 reversed that for four of the five and added two larger pieces (the shared fluid-node contract and LBM multi-GPU sharding). This page records the **current state** of each item.

## Adopted

### Profiler: Perfetto export via the runner

`mime/runner/server.py` now exposes MADDENING's `maddening.core.simulation.profiler` over its ZMQ REP socket through the `profile` and `profile_jax_{start,stop,status}` commands. The `profile` command snapshots `gm._state` around `profile_graph` so the live run is undisturbed. `scripts/profile_experiment.py` is the thin client that requests a profile and writes the trace for [ui.perfetto.dev](https://ui.perfetto.dev).

### AWS / GCP cloud providers

`scripts/launch_job.py` is a provider-agnostic wrapper over MADDENING's `CloudLauncher` (`--job [--dry-run]`). `jobs/production_h100_aws.yaml` and `jobs/production_h100_gcp.yaml` are spot-enabled provider variants of the confinement sweep; spot instances are safe to use now that §5's `ResumableSweep` resumes a preempted sweep. See [`../infrastructure/cloud_launch.md`](../infrastructure/cloud_launch.md).

### `BinaryStateEncoder`: opt-in binary stream + handshake

The runner publishes `ResultFrame`s as JSON by default; setting `MIME_STREAM_FORMAT=binary` (or sending a `set_stream_format` REP command) switches to compact `BinaryStateEncoder` frames. A `stream_info` REP command is the handshake by which a consumer learns the active format and, for binary, the decode schema. The MICROROBOTICA-side C++ binary decoder (the cross-repo follow-on) is implemented on the `feat/binary-resultframe-decoder` branch of MICROROBOTICA.

### `GraphManager.validate_sharding()`

The §7 record framed this as a forward-looking guard (MIME shards nothing).
§8 Step 5 changed that: `IBLBMFluidNode` now shards across devices (see below), so `validate_sharding()` checks a real sharded graph.

### Shared fluid-node contract (*added to §8*)

[`../../src/mime/nodes/environment/FLUID_NODE_CONTRACT.md`](../../src/mime/nodes/environment/FLUID_NODE_CONTRACT.md) defines the interface MIME's four fluid nodes share (`drag_force` / `drag_torque` outputs, `body_*` inputs) so a graph can swap one fluid implementation for another. LBM, Stokeslet and DefectCorrection already conformed; FVM was reconciled additively (single-body FVM now exposes the contract names alongside its per-body multi-body extension).

### LBM multi-GPU sharding (*added to §8, implemented*)

`IBLBMFluidNode` declares the MADDENING v0.2.1 sharded-stencil API surface: `state_fields` excludes the drag domain integrals; `domain_integral_fields` declares them so `ShardedStencilNode` `lax.psum`s them; `static_data` conditionally shards the pipe-wall mask via the new `multigpu_shard_axis` constructor parameter; and `update_padded` takes the v0.2.1 `static_padded` / `shard_info` kwargs. The MIME `maddening` pin is bumped to `>=0.2.1,<0.3`.

**Multi-device execution is implemented.** The sharded path of `update_padded` collides on the halo-padded slab and streams via the slice-based `d3q19.stream_padded`. It recomputes the pipe + UMR missing-link masks per slab (`bounce_back.compute_missing_mask_sharded`, halo-slice on the shard axis / periodic roll on the full axes) and rebuilds the UMR body and rotation-velocity field on the slab's global coordinates (an `origin` offset on the geometry helpers + the momentum-exchange torque). It returns the drag integrals as per-slab partial sums for the wrapper to `psum`. The sharded path is **validated bit-identical** to the single-device step under jit on a 4-device CPU virtual-device mesh: the distribution field and both drag integrals match for one step and over a 50-step rotating trajectory (`tests/verification/test_lbm_sharded_contract.py`).
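
The idea behind that equivalence check can be shown on a 1-D toy in pure Python. This is a minimal sketch, not the MIME or MADDENING implementation: every helper name below (`stream_global`, `split_with_halo`, `stream_padded`, `reduce_partials`) is hypothetical, the "streaming" is a single periodic shift rather than D3Q19, and `reduce_partials` only stands in for the role `lax.psum` plays across devices. It assumes each slab carries a halo at least as wide as the shift, and it uses integer-valued floats so that the sums are exact.

```python
# Toy 1-D sketch: hypothetical helper names, not MADDENING or MIME APIs.

def stream_global(f, shift=1):
    """Reference: periodic streaming on the full, unsharded domain."""
    n = len(f)
    return [f[(i - shift) % n] for i in range(n)]


def split_with_halo(f, n_shards, halo=1):
    """Cut f into equal slabs, each padded with `halo` periodic neighbour cells."""
    n = len(f)
    assert n % n_shards == 0, "domain must divide evenly across shards"
    width = n // n_shards
    slabs = []
    for s in range(n_shards):
        origin = s * width  # global index of the slab's first interior cell
        slabs.append([f[i % n] for i in range(origin - halo, origin + width + halo)])
    return slabs


def stream_padded(slab, shift=1, halo=1):
    """Slice-based streaming on one padded slab; returns interior cells only."""
    assert abs(shift) <= halo, "halo must be at least as wide as the shift"
    return slab[halo - shift : len(slab) - halo - shift]


def reduce_partials(partials):
    """Stand-in for the cross-device sum (the role lax.psum plays in JAX)."""
    return sum(partials)


# Small integer-valued floats keep every sum exact, so equality is exact here.
f = [float((3 * i * i + 1) % 7) for i in range(12)]
slabs = split_with_halo(f, n_shards=4)
interiors = [stream_padded(slab) for slab in slabs]

sharded_field = [v for interior in interiors for v in interior]
sharded_integral = reduce_partials(sum(interior) for interior in interiors)

# The sharded step reproduces the single-domain step, field and integral alike.
assert sharded_field == stream_global(f)
assert sharded_integral == sum(stream_global(f))
```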

**Bouzidi interpolated bounce-back on the sharded path stays deferred** (it needs per-slab SDF q-value recomputation); the node guards `use_bouzidi=True` under sharding with a clear `NotImplementedError`, and simple bounce-back is sharded today. Real multi-GPU-*cluster* validation of the de Boer sweep is a separate hardware step.

## Deferred: infrastructure follow-ups

### FVM multi-GPU sharding

`FVMFluidNode` is an unstructured face graph; MADDENING's `ShardedStencilNode` only handles structured per-axis grids. Sharding the FVM node needs graph-partitioning halos (METIS-style domain decomposition + irregular halo exchange), which is a MADDENING v0.3 roadmap item. This work cannot proceed until MADDENING provides it.

### BEM (Stokeslet) multi-GPU sharding

The Stokeslet node performs a dense LU solve, and MADDENING has no distributed dense linear algebra. The realistic path is replacing the direct solve with a shardable iterative solver (GMRES + preconditioner), which is a numerical-methods change with accuracy implications and its own validation effort.

## Deferred: nothing to consume

### Surrogate primitives: `maddening/surrogates/primitives/`

The package contains only an empty `__init__.py`, so there is nothing to consume. MIME's `surrogates/cholesky_mlp.py` stands alone correctly. Revisit when MADDENING populates the package (this intersects MADDENING's own deferred "decoder-zoo pull-over from MIME" item).

---

*Original §7 evaluation 2026-05-22; revised by §8 close-out 2026-05-22.*