Pre-T2.6 Gate: Fin Resolution and Stability

Fin resolution study (DONE)

All 4 tested resolutions (64^3 through 192^3) produce distinguishable fins with 3 angular bands per z-slice (matching 3 fins per set at 120 degree spacing).

Resolution	Circ. arc (lu)	Radial extent (lu)	Angular bands
64^3	1.4	2.5	3
96^3	2.0	5.5	3
128^3	2.6	6.0	3
192^3	4.1	10.0	3

Note: these are the corrected fin geometry values, with the MIME-ANO-003 fix applied (fin_thickness = 0.15 mm is used as the circumferential blade thickness, per §VI.F p.15).

Minimum viable resolution: 128^3 (circ. arc = 2.6 lu — marginal but usable with Bouzidi IBB). Target resolution for T2.6: 192^3 (circ. arc = 4.1 lu — well-resolved).

Two-pass bounce-back architecture (CONFIRMED)

A single BB pass that uses one combined wall-velocity field for pipe + UMR causes a compressibility instability (Ma > 0.1 at the pipe wall radius). The corrected architecture uses two sequential BB passes with disjoint missing masks:

Pass 1: Pipe wall (stationary) — apply_bounce_back(f_post, f_pre, pipe_missing, solid, wall_velocity=None)
Pass 2: UMR (rotating) — apply_bounce_back(f, f_pre, umr_missing, solid, wall_velocity=omega_x_r)

Both passes use the same f_pre (from lbm_step_split). The second pass receives the output of the first as its f_post_stream argument. The solid_mask parameter is vestigial in both BB functions: it is unused in the computation, which is driven entirely by missing_mask.

Missing masks are disjoint: pipe wall boundary links and UMR boundary links never share a node (minimum gap = 19 lu at worst-case confinement ratio 0.40).
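The call order described above can be sketched on a toy lattice. This is only an illustration under stated assumptions: the D1Q3 lattice, the dict-based storage, and this simplified apply_bounce_back (which drops the vestigial solid argument) are stand-ins invented for the example, not the project's implementation. The moving-wall term is the standard halfway bounce-back momentum correction with c_s^2 = 1/3.

```python
# Toy D1Q3 lattice; the project's real lattice and apply_bounce_back differ.
C = (0, 1, -1)                 # lattice velocities for q = 0, 1, 2
W = (2 / 3, 1 / 6, 1 / 6)      # lattice weights
OPP = (0, 2, 1)                # index of the opposite direction


def apply_bounce_back(f_post_stream, f_pre, missing, wall_velocity=None, rho=1.0):
    """Fill each missing link (node, q) from the pre-streaming populations."""
    f = dict(f_post_stream)
    for node, q in missing:
        value = f_pre[(node, OPP[q])]
        if wall_velocity is not None:
            # Moving-wall momentum correction, using c_s^2 = 1/3.
            value += 6.0 * W[q] * rho * C[q] * wall_velocity[node]
        f[(node, q)] = value
    return f


def two_pass_bounce_back(f_post, f_pre, pipe_missing, umr_missing, umr_wall_velocity):
    """Pass 1: stationary pipe wall. Pass 2: rotating UMR. Both read f_pre."""
    shared = {n for n, _ in pipe_missing} & {n for n, _ in umr_missing}
    if shared:
        raise ValueError(f"pipe and UMR masks share nodes: {sorted(shared)}")
    # Pass 1 output becomes the f_post_stream input of pass 2.
    f = apply_bounce_back(f_post, f_pre, pipe_missing, wall_velocity=None)
    return apply_bounce_back(f, f_pre, umr_missing, wall_velocity=umr_wall_velocity)
```

Because the two masks touch different nodes, neither pass overwrites a link written by the other, so the result does not depend on pass order.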

For Bouzidi IBB: each pass gets its own q-values computed from its own SDF. Pipe wall q-values can use analytical compute_q_values_cylinder. UMR q-values use compute_q_values_sdf with umr_sdf.

Union SDF (CONFIRMED — optional with two-pass)

With two-pass BB, each pass uses its own SDF for q-value computation. A union SDF (min(pipe_sdf, umr_sdf)) is not required for the two-pass architecture but would be needed for a single-pass variant.

No pipe wall SDF exists in the codebase. If single-pass or combined q-value computation is needed later, a trivial cylinder SDF should be added to helix_geometry.py: pipe_sdf(pts) = R_vessel - sqrt(dx^2 + dy^2), where dx and dy are the offsets of each point from the pipe axis (positive inside pipe = fluid, negative outside = solid wall).
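A minimal scalar sketch of the cylinder SDF and the union is given here for reference. It assumes a pipe axis parallel to z through (cx, cy), and that umr_sdf follows the same sign convention (positive in fluid, negative in solid). The parameter names cx, cy and the helper union_sdf are hypothetical, and a real version in helix_geometry.py would presumably be vectorised over pts.

```python
import math


def pipe_sdf(x, y, r_vessel, cx=0.0, cy=0.0):
    """Signed distance to a z-aligned cylindrical pipe wall.

    Positive inside the pipe (fluid), negative outside (solid wall).
    """
    return r_vessel - math.hypot(x - cx, y - cy)


def union_sdf(pipe_value, umr_value):
    """Combine two fluid-positive SDF values: the nearest solid surface wins."""
    return min(pipe_value, umr_value)


# A point 5 lu from the axis of a pipe of radius 10 lu lies 5 lu inside the wall.
distance_to_wall = pipe_sdf(3.0, 4.0, r_vessel=10.0)
```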

Mach number guard (CONFIRMED)

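Constraint: omega * R_fin_lu * sqrt(3) < 0.1, i.e. Ma < 0.1 at the fin tips, where Ma = u_tip / c_s with tip speed u_tip = omega * R_fin_lu and lattice speed of sound c_s = 1/sqrt(3).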

Resolution	R_fin (lu)	Max safe omega (rad/step)	Period (steps)
64^3	9.7	0.00596	1,054
128^3	19.3	0.00299	2,101
192^3	29.0	0.00199	3,156

Target Ma = 0.05 (half of limit for safety margin):

Resolution	Safe omega (Ma=0.05, rad/step)	Period (steps)
64^3	0.00299	2,104
128^3	0.00149	4,209
192^3	0.00100	6,283

Guard implemented as an assertion at sweep initialisation (checked once, not per-step).
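The constraint and the tabulated limits can be reproduced with a few lines. This is a sketch rather than the guard in the codebase, and the function names are hypothetical. It follows the constraint stated above, with c_s = 1/sqrt(3) in lattice units.

```python
import math

CS = 1.0 / math.sqrt(3.0)  # lattice speed of sound


def tip_mach(omega, r_fin_lu):
    """Mach number at the fin tip for angular velocity omega (rad/step)."""
    return omega * r_fin_lu / CS


def max_safe_omega(r_fin_lu, ma_limit=0.1):
    """Largest omega that keeps the fin-tip Mach number at ma_limit."""
    return ma_limit * CS / r_fin_lu


def rotation_period_steps(omega):
    """Number of time steps per full rotation."""
    return 2.0 * math.pi / omega


def assert_mach_guard(omega, r_fin_lu, ma_limit=0.1):
    """One-off check intended for sweep initialisation, not the step loop."""
    ma = tip_mach(omega, r_fin_lu)
    assert ma < ma_limit, f"fin-tip Ma = {ma:.3f} exceeds limit {ma_limit}"
    return ma
```

For R_fin = 29.0 lu this gives a maximum omega of about 0.00199 rad/step and a period of about 3,156 steps, matching the 192^3 row. Note that plain assert statements are skipped when Python runs with -O, so a production guard may prefer an explicit exception.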

128^3 rotating stability check (DONE — PASS, 2026-03-23)

Setup: 128x128x128, confinement ratio 0.30, two-pass BB (pipe static + UMR rotating), omega = 0.00149 rad/step (Ma = 0.05), tau = 0.8, simple BB, 1000 steps.

Results:

NaN: False
Inf: False
u_max: 0.027 lu (< 0.05 threshold)
Ma_max: 0.048 (< 0.1 threshold)
Density conservation: 0.0037% (< 0.01% threshold)
Torque sign: Correct (positive — body pumps momentum into fluid)
Step time: 0.98 s/step on RTX 2060 GPU
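The pass/fail logic behind these results can be sketched as a small gate function. The thresholds are the ones listed above. The summary keys and the function name are hypothetical, and the torque magnitude is a placeholder because only its sign is reported here.

```python
def stability_gates(summary, u_limit=0.05, ma_limit=0.1, density_limit_pct=0.01):
    """Return a pass/fail flag per gate; `summary` keys are hypothetical names."""
    return {
        "no_nan": not summary["has_nan"],
        "no_inf": not summary["has_inf"],
        "u_max": summary["u_max"] < u_limit,
        "ma_max": summary["ma_max"] < ma_limit,
        "density": abs(summary["density_drift_pct"]) < density_limit_pct,
        "torque_sign": summary["mean_torque"] > 0.0,
    }


# Values from the 128^3 check; mean_torque is a placeholder positive value.
run_128 = {
    "has_nan": False,
    "has_inf": False,
    "u_max": 0.027,
    "ma_max": 0.048,
    "density_drift_pct": 0.0037,
    "mean_torque": 1.0,
}
gates = stability_gates(run_128)
all_pass = all(gates.values())
```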

Convergence rate (measured at 64^3, 2026-03-23)

Confinement ratio 0.30, two-pass BB, omega = 0.003, tau = 0.8, Ma = 0.05. Converged at step 4400 (~2.1 rotation periods), rel_change = 0.73%. Convergence criterion: 2% relative change in mean drag torque between consecutive rotation periods.

Convergence measured in rotation periods is expected to be resolution-independent (same physics). At 192^3, with period = 6,283 steps, 2.1 periods gives expected convergence at ~13,000 steps.
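One way to evaluate the 2% criterion is sketched here. It assumes the check compares the mean torque over the most recent rotation period with the mean over the period before it, re-evaluated every check_every steps. That sliding-window reading is consistent with convergence being reported at ~2.1 periods, but it is an assumption, as are the function names and the check_every parameter.

```python
def rel_change_between_periods(torque, period_steps, end):
    """Relative change in mean torque between the two periods ending at `end`."""
    if end < 2 * period_steps:
        raise ValueError("need at least two full rotation periods")
    last = torque[end - period_steps:end]
    prev = torque[end - 2 * period_steps:end - period_steps]
    mean_last = sum(last) / period_steps
    mean_prev = sum(prev) / period_steps
    if mean_last == 0.0:
        raise ValueError("mean torque over the last period is zero")
    return abs(mean_last - mean_prev) / abs(mean_last)


def first_converged_step(torque, period_steps, tol=0.02, check_every=100):
    """First checked step at which the relative change drops below tol, else None."""
    for end in range(2 * period_steps, len(torque) + 1, check_every):
        if rel_change_between_periods(torque, period_steps, end) < tol:
            return end
    return None
```

The earliest step at which this check can pass is two full periods, so scaling the measured 2.1 periods by the 192^3 period of 6,283 steps gives roughly 13,200 steps, in line with the ~13,000 estimate.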

Cloud rehearsal (DONE — PASS, 2026-03-23)

Setup: A100 SXM 80GB on RunPod (US), 192^3, confinement ratio 0.30, two-pass BB, omega = 0.001 (Ma = 0.05), simple BB, 500 steps.

Results:

All 6 gates PASS
NaN: False, Inf: False
u_max: 0.028 lu (< 0.05)
Density conservation: 0.001% (< 0.01%)
Torque sign: Correct
Step time: 0.058 s/step on A100 SXM 80GB

Issues fixed during rehearsal:

GPU type: A100-80GB (PCIe) → A100-80GB-SXM (SXM)
cuDNN: Docker image CUDA 12.2 incompatible with host driver 570 (CUDA 12.8). Fixed by adding pip3 install --upgrade 'jax[cuda12]' as first setup step.
SkyPilot lifecycle: stream_and_get returns at job submission, not completion. Fixed with SSH polling in launch script.
Git hash: .git/ not synced to cloud. Fixed by writing hash to file pre-sync.

GPU choice (REVISED after rehearsal)

A100 SXM at $1.49/hr (RunPod). Revised rationale:

Measured step time at 192^3: 0.058 s/step — 17x faster than RTX 2060 extrapolation
The H100 SXM advantage (1.68x bandwidth) gives ~0.035 s/step — only 0.023s faster
At these step times, the H100 premium ($2.69 vs $1.49/hr) costs more than it saves
A100 is cheaper for this job: $1.21 vs $1.31 on H100

Resolution (DECIDED)

192^3. Fin circumferential arc = 4.1 lu (well-resolved).

Revised cost estimate (from measured A100 SXM timing)

Scenario	Steps	A100 SXM step time	Time per ratio	4 ratios	Cost
Optimistic (1.5 periods)	9,400	0.058s	9 min	36 min	$0.89
Expected (2 periods)	12,600	0.058s	12 min	49 min	$1.21
Conservative (3 periods)	18,800	0.058s	18 min	73 min	$1.81

Budget: $50.00. Expected cost: $1.21. Reserve: $48.79. Massive headroom — enough for multiple re-runs, Track C collection, orientation repeats, held-out test points, and future resolution escalation if needed.
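The arithmetic behind the cost table can be checked with a short helper. This is a sketch with hypothetical function names. It assumes cost is simply wall-clock hours times the hourly rate, and it ignores setup time, data transfer, and billing granularity.

```python
def sweep_minutes(steps_per_ratio, step_time_s, n_ratios):
    """Wall-clock minutes for a sweep over n_ratios confinement ratios."""
    return steps_per_ratio * step_time_s * n_ratios / 60.0


def sweep_cost(steps_per_ratio, step_time_s, n_ratios, rate_per_hr):
    """Compute cost in dollars at a flat hourly rate."""
    return sweep_minutes(steps_per_ratio, step_time_s, n_ratios) / 60.0 * rate_per_hr


# Expected scenario: 12,600 steps per ratio, 4 ratios, A100 SXM at $1.49/hr.
expected_cost = sweep_cost(12_600, 0.058, 4, 1.49)
reserve = 50.00 - round(expected_cost, 2)
```

The expected and conservative rows reproduce to the cent. The optimistic row comes out near $0.90 unrounded; the tabulated $0.89 matches if the time is rounded to 36 min before multiplying. Running the same helper with 0.035 s/step and $2.69/hr gives roughly $1.3, in line with the H100 figure quoted in the GPU choice section.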

Decision: PROCEED with T2.6

All pre-launch gate checks pass. Architecture confirmed. Cloud rehearsal PASS. Ready for A100 SXM production launch.