What’s new in v0.2#
Added in version v0.2.
v0.2 brings MIME into full alignment with MADDENING v0.2.1 and lands the
two pieces of new architecture the v1.0 stack roadmap puts on the critical
path: multi-GPU sharding of the IB-LBM fluid node (the infrastructure
unblock for the de Boer step-out replication at scale) and the EffectModel
contract pilot (the first cut of the composable force-field-effect
abstraction). It also adopts the v0.2 node-API surfaces — halo_width(),
the static_data channel, compile-time edge validation — and folds in the
TF32 / float32 numerical-precision fixes.
Warning
MADDENING floor is now >=0.2.1,<0.3. The IB-LBM sharded path consumes
v0.2.1’s sharded-StaticArray slicing and domain_integral_fields, and
v0.2.1 escalates edge shape/dtype mismatches from warnings to a hard
ExceptionGroup of EdgeValidationErrors at compile(). Build a complete
boundary_input_spec() on custom nodes. See
Edge validation.
The CHANGELOG.md
has the itemised diff. This page is the narrative summary.
Highlights#
Multi-GPU IB-LBM sharding#
IBLBMFluidNode now runs pencil-decomposed across devices via MADDENING
v0.2.1’s ShardedStencilNode. The full multi-device update_padded is
implemented (it previously raised NotImplementedError): collision runs on
the halo-padded slab, streaming switches from the periodic jnp.roll to a
slice-based halo read (d3q19.stream_padded), the pipe-wall and rotating-UMR
missing-link masks are recomputed per slab (bounce_back.compute_missing_mask_sharded),
the UMR body and its rotation-velocity field are rebuilt on each slab’s
global coordinate range (an origin offset threaded through the geometry
helpers and the momentum-exchange torque), and drag_force / drag_torque
are declared domain_integral_fields so the wrapper reduces the per-slab
partial sums across the mesh with lax.psum.
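The switch from a periodic roll to a slice-based halo read can be illustrated on a 1-D toy lattice. This is a minimal sketch in plain Python, not MIME's d3q19.stream_padded: the helper names (roll, pad_slab, stream_padded_1d, sharded_stream) are hypothetical, and the periodic wrap in pad_slab only stands in for the halo exchange that the sharding wrapper would perform.

```python
def roll(xs, shift):
    """Periodic shift of a whole 1-D lattice (single-device analogue of jnp.roll)."""
    n = len(xs)
    return [xs[(i - shift) % n] for i in range(n)]


def pad_slab(xs, start, stop, halo):
    """Slab xs[start:stop] plus `halo` cells on each side, wrapped periodically.

    Hypothetical stand-in for the halo exchange between neighbouring devices.
    """
    n = len(xs)
    return [xs[i % n] for i in range(start - halo, stop + halo)]


def stream_padded_1d(padded, shift, halo):
    """Shift one slab by slicing its halo-padded copy; no wrap-around needed."""
    if abs(shift) > halo:
        raise ValueError("shift must not exceed the halo width")
    interior = len(padded) - 2 * halo
    begin = halo - shift
    return padded[begin:begin + interior]


def sharded_stream(xs, n_slabs, shift, halo=1):
    """Stream every slab independently, then concatenate the interiors."""
    if len(xs) % n_slabs:
        raise ValueError("lattice length must divide evenly into slabs")
    width = len(xs) // n_slabs
    out = []
    for s in range(n_slabs):
        padded = pad_slab(xs, s * width, (s + 1) * width, halo)
        out.extend(stream_padded_1d(padded, shift, halo))
    return out


lattice = list(range(16))
for shift in (-1, 0, 1):
    # The slab-wise result matches the global periodic shift exactly.
    assert sharded_stream(lattice, 4, shift) == roll(lattice, shift)
```

Because each slab only reads from its own padded copy, the per-slab work needs no global view of the lattice once the halos are filled.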
Construct the node with multigpu_shard_axis=<spatial axis> and wrap it in
ShardedStencilNode(node, mesh, axis_map={"devices": <axis>}, boundary="periodic").
The pipe wall is the only sharded StaticArray (it is axis-aligned (nx,ny,nz));
the missing-link masks are recomputed per slab rather than carried as a
transposed (19, …) static, which the wrapper’s shard-axis check rejects.
The sharded step is validated as bit-identical to the single-device step
under jit on a 4-device CPU virtual-device mesh. The check covers the
distribution field and both psum-reduced drag integrals, for a single step
and for a 50-step rotating trajectory
(tests/verification/test_lbm_sharded_contract.py, marked slow; run with
JAX_PLATFORMS=cpu XLA_FLAGS="--xla_force_host_platform_device_count=4").
Note
Two narrow pieces stay deferred: Bouzidi interpolated bounce-back on the
sharded path (needs per-slab SDF q-value recomputation; the node guards
use_bouzidi=True under sharding with a clear NotImplementedError), and
real multi-GPU-cluster validation of the de Boer sweep (the node-level
bit-compat on virtual devices is the infrastructure proof; cluster runs are a
separate hardware step).
EffectModel contract pilot (mime.effects)#
A new package introduces the EffectModel abstraction — a common builder
pattern for environment effects that deliver force/torque to a body
(per ADR-2026-EFFECT-MODEL). v0.2 ships the pilot: the Protocol surface,
the registry, the Experiment composition + validation, and the
HydrodynamicModel family. See
EffectModel contract.
- Protocol surface: EffectModel (coupling_ports / applicable_regime / build),
  SourcedEffectModel[S] (covariant source TypeVar, for the v0.3 magnetic
  family), EffectHandle, CouplingSpec, and a polymorphic Regime ABC with
  HydrodynamicRegime.
- Registry: @register_effect("Family.Backend") / list_registered_effects() /
  get_effect() (decorator-based and auditable; no importlib auto-discovery).
- Experiment: attach() / couple() / build() with a six-pass validation
  contract, each pass raising a specific typed error (CouplingError,
  PortTypeMismatchError, BodyPropertyMissing, MediumPropertyMissing,
  EffectBuildError), plus load-time MIME-version compatibility validation
  (IncompatibleMimeVersionError) and the version-and-provenance metadata
  fields (asset_paths / asset_hashes, benchmark_refs, citation).
- HydrodynamicModel: LBM / FVM / Stokeslet / DefectCorrection backends that
  adapt the existing fluid nodes over the shared drag contract, so one
  backend can be swapped for another across the same graph edges. The LBM
  backend carries the lattice→SI edge transforms; the others are SI.
- Runnable concept-proof: test_effectmodel_stokes_drag_swap composes a
  kinematic sphere + a backend through Experiment (driving it via composed
  external inputs), runs it, and reads the drag: the Stokeslet backend
  reproduces analytical Stokes drag (6πμaV) to ≈0.4%, and swapping a single
  attach() line runs the FVM Navier–Stokes solver across the identical
  body/edges. (This is free-space drag; the confined microrobot experiments
  need the v0.3 magnetic family + coupling composition.)
- Drag sign normalization: the raw fluid nodes disagree on sign (IBLBM and
  standalone Stokeslet report the reaction +R·motion, which is
  anti-dissipative if a body adds it, so a de-Boer UMR diverges without its
  omega_max clamp; FVM reports the force on the body). The HydrodynamicModel
  adapter normalizes every backend to the contract sign (force on the body)
  via native_drag_sign, so a swapped backend always delivers dissipative
  drag.
- Experiment.add_external_input composes graph-external inputs (e.g. a
  kinematic body’s prescribed velocity).
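The decorator registry and the drag sign normalization described above can be sketched together in a few lines. This is an illustrative toy, not the mime.effects implementation: the two backend classes, raw_drag, drag_on_body and the module-level _REGISTRY are hypothetical, and treating native_drag_sign as a ±1 multiplier is an assumption about how the adapter uses it. Only the registry function names and the Stokes formula 6πμaV come from the text.

```python
import math

_REGISTRY = {}  # hypothetical module-level store: "Family.Backend" -> class


def register_effect(name):
    """Class decorator that records a backend under an explicit key."""
    def decorator(cls):
        if name in _REGISTRY:
            raise ValueError(f"effect {name!r} is already registered")
        _REGISTRY[name] = cls
        return cls
    return decorator


def list_registered_effects():
    return sorted(_REGISTRY)


def get_effect(name):
    return _REGISTRY[name]


def stokes_drag(mu, a, v):
    """Magnitude of Stokes drag on a sphere: 6*pi*mu*a*|V|."""
    return 6.0 * math.pi * mu * a * abs(v)


@register_effect("Hydrodynamic.ToyReaction")
class ToyReactionBackend:
    """Reports the reaction +R*motion, i.e. along the velocity."""
    native_drag_sign = -1  # assumed meaning: multiply to get force on the body

    def raw_drag(self, mu, a, v):
        return math.copysign(stokes_drag(mu, a, v), v)


@register_effect("Hydrodynamic.ToyOnBody")
class ToyOnBodyBackend:
    """Already reports the force on the body, opposing the velocity."""
    native_drag_sign = +1

    def raw_drag(self, mu, a, v):
        return -math.copysign(stokes_drag(mu, a, v), v)


def drag_on_body(backend_name, mu, a, v):
    """Normalize any registered backend to the contract sign."""
    backend = get_effect(backend_name)()
    return backend.native_drag_sign * backend.raw_drag(mu, a, v)


# Swapping the backend name changes the raw sign convention, not the result.
forces = {name: drag_on_body(name, mu=1e-3, a=1e-6, v=1e-3)
          for name in list_registered_effects()}
```

Every entry in forces is negative for a positive velocity, which is the dissipative direction the contract requires.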
Note
The MagneticModel family, the SourceInputProvider sub-abstraction, the
cross-effect coupling implementation behind the ports, and full
make_experiment(params) -> Experiment migration of the de Boer / de Jongh
experiments are deferred to v0.3 — they need the magnetic family, since
those experiments’ actuation chains are not yet EffectModels.
MADDENING v0.2 node-API alignment#
- requires_halo (boolean property) is superseded by halo_width() (a method
  returning dict[int, int]). v0.2 derives the old property for backward
  compatibility, but subclassing a node that still defines requires_halo
  emits a FutureWarning; v0.3 escalates this to a MigrationError. See Node
  API migration.
- static_data is the channel for large non-evolving arrays (meshes, BEM LU
  factors, MLP weights). It keeps them out of the state pytree and out of
  every checkpoint and, when declared replication="shard", is what the
  sharded LBM path slices per device. See Static data usage.
- Shared fluid-node contract: FVMFluidNode, IBLBMFluidNode,
  StokesletFluidNode and DefectCorrectionFluidNode are graph-interchangeable
  for the single-immersed-body case (drag_force / drag_torque outputs,
  body_* inputs). FVM was reconciled additively (single-body FVM now exposes
  the contract names alongside its multi-body force_<name> / torque_<name>
  extension). The contract lives at
  src/mime/nodes/environment/FLUID_NODE_CONTRACT.md; the EffectModel
  HydrodynamicModel family builds on it.
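The relationship between the legacy property and the new method can be sketched with a toy base class. This is not MADDENING's SimulationNode: ToyNode, PointwiseNode, StencilNode and define_legacy_node are hypothetical names, and reading the dict[int, int] as "spatial axis to halo cells" is an assumption based on the "per-axis halo dict" wording.

```python
import warnings


class ToyNode:
    """Hypothetical stand-in for a simulation-node base class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # A subclass that still spells out the legacy property gets a warning.
        if "requires_halo" in cls.__dict__:
            warnings.warn(
                f"{cls.__name__} defines requires_halo; override halo_width() instead",
                FutureWarning,
                stacklevel=2,
            )

    def halo_width(self) -> dict[int, int]:
        """Assumed meaning: spatial axis -> halo cells; {} means pointwise."""
        return {}

    @property
    def requires_halo(self) -> bool:
        """Legacy view, derived from halo_width() for backward compatibility."""
        return any(width > 0 for width in self.halo_width().values())


class PointwiseNode(ToyNode):
    pass  # inherits the {} default


class StencilNode(ToyNode):
    def halo_width(self) -> dict[int, int]:
        return {0: 1, 1: 1, 2: 1}  # one halo cell on each of three axes


def define_legacy_node():
    """Define a v0.1-style subclass and return the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")

        class LegacyNode(ToyNode):
            requires_halo = True

    return [w.category for w in caught]
```

Deriving the old property from the new method keeps existing callers working while the warning points subclass authors at the replacement.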
Compile-time edge validation#
GraphManager.compile() checks every edge’s shape, dtype and units against
the target’s BoundaryInputSpec. MADDENING v0.2.1 raises an ExceptionGroup
of ShapeMismatchError / DtypeMismatchError on shape/dtype mismatches;
unit mismatches stay advisory (UnitMismatchWarning). MIME’s experiment
graphs are pinned clean by a test. See
Edge validation.
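The split between hard errors and advisory warnings can be sketched as follows. Everything here is a local toy: PortSpec, validate_edges, compile_graph, summarize and the sample EDGES are hypothetical, and the three exception and warning classes are redefined locally rather than imported from MADDENING. ExceptionGroup is a builtin only from Python 3.11, so compile_graph assumes that version.

```python
import warnings
from dataclasses import dataclass


class ShapeMismatchError(ValueError):
    """Local toy class, named after the error described in the text."""


class DtypeMismatchError(TypeError):
    """Local toy class, named after the error described in the text."""


class UnitMismatchWarning(UserWarning):
    """Local toy class for the advisory unit check."""


@dataclass(frozen=True)
class PortSpec:
    shape: tuple
    dtype: str
    units: str


def validate_edges(edges):
    """Collect every shape/dtype error; report unit mismatches as warnings."""
    errors = []
    for name, (source, target) in edges.items():
        if source.shape != target.shape:
            errors.append(ShapeMismatchError(f"{name}: {source.shape} != {target.shape}"))
        if source.dtype != target.dtype:
            errors.append(DtypeMismatchError(f"{name}: {source.dtype} != {target.dtype}"))
        if source.units != target.units:
            warnings.warn(f"{name}: {source.units} != {target.units}", UnitMismatchWarning)
    return errors


def compile_graph(edges):
    """Raise all hard errors at once instead of stopping at the first."""
    errors = validate_edges(edges)
    if errors:
        raise ExceptionGroup("edge validation failed", errors)  # Python 3.11+
    return True


EDGES = {
    "body.velocity -> fluid.body_velocity": (
        PortSpec((3,), "float32", "m/s"), PortSpec((3,), "float32", "m/s")),
    "fluid.drag_force -> body.force": (
        PortSpec((3,), "float32", "N"), PortSpec((3, 1), "float32", "N")),
    "fluid.drag_torque -> body.torque": (
        PortSpec((3,), "float64", "N*m"), PortSpec((3,), "float32", "N*mm")),
}


def summarize(edges):
    """Return (error type names, number of advisory warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        errors = validate_edges(edges)
    return [type(e).__name__ for e in errors], len(caught)
```

Grouping the errors means one compile() call reports every mismatched edge, so a custom node with an incomplete spec can be fixed in a single pass.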
Preempt-resilient parameter sweeps#
mime.data.sweep_resume.ResumableSweep wraps MADDENING v0.2’s checkpoint API
(save_state_with_manifest / load_state_with_manifest, manifest-hash
integrity) into a sweep facility: items finished in a prior run are skipped,
each new item is checkpointed immediately, and a snapshot directory mirrors
state to durable storage so cloud-spot preemptions resume cleanly.
scripts/run_confinement_sweep.py is the worked example. See
Preempt / resume.
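The skip-finished, checkpoint-immediately pattern can be sketched with the standard library alone. This is not ResumableSweep and does not call MADDENING's checkpoint API: save_item, load_item and run_sweep are hypothetical helpers, the JSON-plus-SHA-256 record is an assumed stand-in for the manifest-hash integrity check, and the mirror to durable storage is left out.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def _digest(payload: str) -> str:
    return hashlib.sha256(payload.encode()).hexdigest()


def save_item(directory: Path, key: str, result) -> None:
    """Write one finished item together with a hash of its payload."""
    payload = json.dumps(result, sort_keys=True)
    record = {"payload": payload, "sha256": _digest(payload)}
    tmp = directory / f"{key}.json.tmp"
    tmp.write_text(json.dumps(record))
    tmp.replace(directory / f"{key}.json")  # rename, so no half-written checkpoint


def load_item(directory: Path, key: str):
    """Return the stored result, or None if it is missing or fails the hash check."""
    path = directory / f"{key}.json"
    if not path.exists():
        return None
    record = json.loads(path.read_text())
    if _digest(record["payload"]) != record["sha256"]:
        return None  # corrupt checkpoint: recompute the item
    return json.loads(record["payload"])


def run_sweep(directory: Path, items, compute):
    """Skip items finished in a prior run; checkpoint each new one at once."""
    results, computed = {}, []
    for item in items:
        key = str(item)
        cached = load_item(directory, key)
        if cached is None:
            cached = compute(item)
            save_item(directory, key, cached)
            computed.append(item)
        results[key] = cached
    return results, computed


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp)
    # First run is "preempted" after two items.
    _, first = run_sweep(ckpt, [1, 2], lambda x: x * x)
    # The resumed run only computes the item that was still missing.
    results, second = run_sweep(ckpt, [1, 2, 3], lambda x: x * x)
```

Checkpointing per item bounds the work lost to a preemption at one item, which is what makes spot instances a reasonable default for long sweeps.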
Runner — profiler, opt-in binary stream, declarative actor poses#
The experiment runner (mime.runner.server) gains REP commands:
- profile / profile_jax_{start,stop,status}: return a Perfetto-format trace
  from MADDENING’s profiler; scripts/profile_experiment.py is the thin
  client.
- stream_info / set_stream_format: frames publish as JSON by default; compact
  BinaryStateEncoder frames are opt-in via MIME_STREAM_FORMAT=binary (or
  set_stream_format), with stream_info as the handshake by which a consumer
  learns the active format and decode schema.
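The handshake idea, where a consumer asks for the active format and schema before decoding frames, can be sketched as below. The wire layout here is entirely hypothetical: SCHEMA, the shape of the stream_info reply, and the little-endian float32 packing are assumptions made for illustration and are not BinaryStateEncoder's actual format.

```python
import json
import struct

# Hypothetical decode schema: (field name, number of float32 values).
SCHEMA = [("position", 3), ("orientation", 4)]


def stream_info(fmt):
    """Toy handshake reply: the active format plus what a decoder needs."""
    return {"format": fmt, "fields": SCHEMA, "dtype": "<f4"}


def encode_frame(state, fmt):
    """Encode one state frame as JSON text or as packed float32 values."""
    if fmt == "json":
        return json.dumps(state).encode()
    flat = [x for name, _ in SCHEMA for x in state[name]]
    return struct.pack(f"<{len(flat)}f", *flat)


def decode_frame(payload, info):
    """Decode a frame using only what the handshake reply announced."""
    if info["format"] == "json":
        return json.loads(payload.decode())
    total = sum(count for _, count in info["fields"])
    flat = struct.unpack(f"<{total}f", payload)
    out, offset = {}, 0
    for name, count in info["fields"]:
        out[name] = list(flat[offset:offset + count])
        offset += count
    return out


state = {"position": [0.5, 1.0, -2.0], "orientation": [1.0, 0.0, 0.0, 0.0]}
binary_frame = encode_frame(state, "binary")
json_frame = encode_frame(state, "json")
```

The binary frame carries no field names, so it is smaller but only decodable by a consumer that has completed the handshake first.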
The runner also no longer hardcodes pose recipes for arm_link_<N>,
motor_rotor and magnet. An actor whose pose lives at a non-default
location declares it in experiment.yaml:
scene:
  actors:
    motor_rotor:
      pose_from: { node: motor, field: rotor_pose_world }
    arm_link_3:
      pose_from: { node: arm, field: link_poses_world, index: 3 }
    body:
      state_fields: [position, orientation]   # unchanged generic path
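A consumer of such a declaration might resolve poses roughly as follows. This is a sketch under assumptions, not the runner's code: resolve_pose, the SCENE dict (a Python mirror of the YAML above) and the toy STATE layout are hypothetical, and the string pose values are placeholders for real transforms.

```python
def resolve_pose(actor_name, actor_cfg, graph_state):
    """Look up an actor's pose from declarative config instead of a hardcoded recipe."""
    pose_from = actor_cfg.get("pose_from")
    if pose_from is None:
        # Generic path: the actor name matches a graph node.
        fields = actor_cfg.get("state_fields", ["position", "orientation"])
        node_state = graph_state[actor_name]
        return {field: node_state[field] for field in fields}
    value = graph_state[pose_from["node"]][pose_from["field"]]
    if "index" in pose_from:
        value = value[pose_from["index"]]  # pick one entry of a stacked pose array
    return value


# Python mirror of the actors block in experiment.yaml.
SCENE = {
    "motor_rotor": {"pose_from": {"node": "motor", "field": "rotor_pose_world"}},
    "arm_link_3": {"pose_from": {"node": "arm", "field": "link_poses_world", "index": 3}},
    "body": {"state_fields": ["position", "orientation"]},
}

# Toy graph state; strings stand in for pose matrices.
STATE = {
    "motor": {"rotor_pose_world": "T_rotor"},
    "arm": {"link_poses_world": ["T0", "T1", "T2", "T3"]},
    "body": {"position": (0.0, 0.0, 0.0), "orientation": (1.0, 0.0, 0.0, 0.0)},
}

poses = {name: resolve_pose(name, cfg, STATE) for name, cfg in SCENE.items()}
```

Keeping the lookup data-driven means a new composite actor needs a config entry only, with no change to the runner.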
Cloud — AWS / GCP#
scripts/launch_job.py is the provider-agnostic launcher;
jobs/production_h100_aws.yaml and jobs/production_h100_gcp.yaml are
spot-enabled templates (safe now that ResumableSweep resumes preempted
sweeps). See Launching cloud jobs.
Numerical & precision fixes#
- TF32: GPU TF32 was silently corrupting low-magnitude float32 matmuls (e.g.
  LBM momentum-moment near-cancellations). v0.2 forces full precision in the
  LBM moment transforms and the FVM pressure solver; custom GPU code should
  use precision="highest" for precision-sensitive matmuls. See GPU precision.
- LBM mass conservation: fixed a BGK-collision rounding bias that drifted
  total density over long runs.
- Oberbeck-Stechert drag coefficients: oberbeck_stechert_coefficients was
  reformulated for float32 stability. The prolate-spheroid denominators
  subtract a leading ±2e that cancels against 2·atanh(e), destroying
  single-precision accuracy near e = 0 (C₁ = 1.037 instead of 1.000 at
  e = 0.01). The cancellation-free form via g = atanh(e) − e (a Maclaurin
  series for small e, direct atanh otherwise) is algebraically identical and
  validated to ≤ 2.6 × 10⁻⁷ against a float64 reference across
  e ∈ [0.001, 0.99]. See Rigid body.
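The cancellation in g = atanh(e) − e can be reproduced without a GPU by rounding every intermediate to single precision. The sketch below is illustrative only and is not the oberbeck_stechert_coefficients code: f32, g_naive_f32, g_series_f32 and rel_err are hypothetical helpers, float32 is emulated through struct, and the five-term truncation of the Maclaurin series atanh(e) = e + e³/3 + e⁵/5 + … is an assumption suited to the small-e samples used here.

```python
import math
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def g_naive_f32(e):
    """atanh(e) - e with every intermediate rounded to float32."""
    e = f32(e)
    return f32(f32(math.atanh(e)) - e)


def g_series_f32(e):
    """e^3 * (1/3 + e^2/5 + e^4/7 + e^6/9 + e^8/11), Horner form, in float32."""
    e = f32(e)
    e2 = f32(e * e)
    poly = f32(1.0 / 11.0)
    for denom in (9.0, 7.0, 5.0, 3.0):
        poly = f32(f32(poly * e2) + f32(1.0 / denom))
    return f32(f32(e * e2) * poly)


def rel_err(approx, e):
    """Relative error against a float64 reference at the same rounded input."""
    exact = math.atanh(f32(e)) - f32(e)
    return abs(approx - exact) / exact


samples = (0.001, 0.002, 0.005, 0.01, 0.02, 0.05)
worst_naive = max(rel_err(g_naive_f32(e), e) for e in samples)
worst_series = max(rel_err(g_series_f32(e), e) for e in samples)
```

The naive difference subtracts two nearly equal numbers and keeps only their rounding noise, while the series never forms that difference, so worst_series stays near single-precision round-off and worst_naive does not.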
Test infrastructure — scoped jax_enable_x64#
Double precision is now opt-in per test via a @pytest.mark.x64 marker plus
an autouse conftest.py fixture that enables x64 for marked tests and
restores the prior value on teardown. This replaces module-level
jax.config.update("jax_enable_x64", True) calls that ran at collection
time and leaked x64 into the whole session — an order-dependence hazard under
pytest-xdist that also masked genuine float32 defects (the Oberbeck
cancellation above was one). tests/test_x64_isolation.py guards against
regressions.
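The save-and-restore behaviour of the fixture can be sketched with a context manager over a stand-in config object. ToyConfig and scoped_flag are hypothetical and nothing here imports jax or pytest; in a real conftest.py the same body would sit inside an autouse fixture that first checks request.node.get_closest_marker("x64").

```python
from contextlib import contextmanager


class ToyConfig:
    """Hypothetical stand-in for a process-global config such as jax.config."""

    def __init__(self):
        self.values = {"jax_enable_x64": False}

    def update(self, name, value):
        self.values[name] = value


@contextmanager
def scoped_flag(config, name, value):
    """Set a flag for the duration of the block, then restore the prior value."""
    prior = config.values[name]
    config.update(name, value)
    try:
        yield
    finally:
        config.update(name, prior)  # teardown runs even if the test body raises


config = ToyConfig()
observed = []

with scoped_flag(config, "jax_enable_x64", True):
    observed.append(config.values["jax_enable_x64"])  # inside a marked test

observed.append(config.values["jax_enable_x64"])      # the next, unmarked test
```

Restoring the prior value rather than hardcoding False keeps the fixture correct when a session is deliberately started with x64 already on.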
Migration playbook#
A custom SimulationNode written against MIME v0.1 needs at minimum:
1. Replace requires_halo with halo_width(). Pointwise nodes drop the old
   property and inherit the {} default; stencil nodes return a per-axis halo
   dict. See the migration guide.
2. (Optional) Move large non-evolving arrays to static_data. Wrap them in
   StaticArray(value) and override the static_data property. For a shardable
   stencil node, declare the spatial-axis-aligned arrays replication="shard".
   See Static data usage.
3. Declare a complete boundary_input_spec(): every input gets a concrete
   shape, dtype, and expected_units tag. Edge validation now errors on
   shape/dtype mismatches at compile().
4. Bump the maddening pin to >=0.2.1,<0.3.
5. Migrate experiment configs with composite actors. Any actor whose name
   does not match a graph node (arm_link_<N>, motor_rotor, magnet, …) needs
   an explicit pose_from: { node, field, index? } block in experiment.yaml;
   the runner no longer hardcodes these recipes. See the migrated
   experiments/ar4_helical_drive/experiment.yaml.
6. (Test authors) Replace module-level
   jax.config.update("jax_enable_x64", True) with
   pytestmark = pytest.mark.x64 (or per-test @pytest.mark.x64). Tests that
   silently relied on a leaked x64 will fail in isolation: give them the
   marker, or make the assertion float32-robust.
New optional dependencies#
| Pulls in | For |
|---|---|
| | The 3.10 backport of |
What’s still in flight#
Tracked before/after the v0.2.0 tag:
- Bouzidi IBB on the sharded LBM path: per-slab SDF q-value recomputation
  (simple bounce-back is sharded today).
- FVM and BEM sharding: blocked on MADDENING v0.3 infrastructure
  (graph-partitioning halos / a distributed sparse iterative solver). See
  Optional v0.2 features.
- EffectModel v0.3 scope: the MagneticModel family, SourceInputProvider,
  cross-effect coupling implementation, and the full de Boer / de Jongh
  make_experiment migration.
- Folding feat/v0.2-fitup into main and tagging v0.2.0.