Pre-T2.6 Gate: Fin Resolution and Stability#

Fin resolution study (DONE)#

All 4 tested resolutions (64^3 through 192^3) produce distinguishable fins with 3 angular bands per z-slice (matching 3 fins per set at 120 degree spacing).

Resolution

Circ. arc (lu)

Radial extent (lu)

Angular bands

64^3

1.4

2.5

3

96^3

2.0

5.5

3

128^3

2.6

6.0

3

192^3

4.1

10.0

3

Note: these are corrected fin geometry values (MIME-ANO-003 fix applied — fin_thickness = 0.15mm used as circumferential blade thickness per §VI.F p.15).

Minimum viable resolution: 128^3 (circ. arc = 2.6 lu — marginal but usable with Bouzidi IBB). Target resolution for T2.6: 192^3 (circ. arc = 4.1 lu — well-resolved).

Two-pass bounce-back architecture (CONFIRMED)#

Single-pass BB with combined wall velocity for pipe + UMR causes compressibility instability (Ma > 0.1 at pipe wall radius). The corrected architecture uses two sequential BB passes with disjoint missing masks:

  1. Pass 1: Pipe wall (stationary) — apply_bounce_back(f_post, f_pre, pipe_missing, solid, wall_velocity=None)

  2. Pass 2: UMR (rotating) — apply_bounce_back(f, f_pre, umr_missing, solid, wall_velocity=omega_x_r)

Both passes use the same f_pre (from lbm_step_split). The second pass receives the output of the first as its f_post_stream argument. The solid_mask parameter is vestigial in both BB functions (unused in computation, driven by missing_mask).

Missing masks are disjoint: pipe wall boundary links and UMR boundary links never share a node (minimum gap = 19 lu at worst-case confinement ratio 0.40).

For Bouzidi IBB: each pass gets its own q-values computed from its own SDF. Pipe wall q-values can use analytical compute_q_values_cylinder. UMR q-values use compute_q_values_sdf with umr_sdf.

Union SDF (CONFIRMED — optional with two-pass)#

With two-pass BB, each pass uses its own SDF for q-value computation. A union SDF (min(pipe_sdf, umr_sdf)) is not required for the two-pass architecture but would be needed for a single-pass variant.

No pipe wall SDF exists in the codebase. A trivial cylinder SDF should be added to helix_geometry.py if single-pass or combined q-value computation is needed later: pipe_sdf(pts) = R_vessel - sqrt(dx^2 + dy^2) (positive inside pipe = fluid, negative outside = solid wall).

Mach number guard (CONFIRMED)#

Constraint: omega * R_fin_lu * sqrt(3) < 0.1 (Ma < 0.1 at fin tips).

Resolution

R_fin (lu)

Max safe omega

Period (steps)

64^3

9.7

0.00596

1,054

128^3

19.3

0.00299

2,101

192^3

29.0

0.00199

3,156

Target Ma = 0.05 (half of limit for safety margin):

Resolution

Safe omega (Ma=0.05)

Period (steps)

64^3

0.00299

2,104

128^3

0.00149

4,209

192^3

0.00100

6,283

Guard implemented as an assertion at sweep initialisation (checked once, not per-step).

128^3 rotating stability check (DONE — PASS, 2026-03-23)#

Setup: 128x128x128, confinement ratio 0.30, two-pass BB (pipe static + UMR rotating), omega = 0.00149 rad/step (Ma = 0.05), tau = 0.8, simple BB, 1000 steps.

Results:

  • NaN: False

  • Inf: False

  • u_max: 0.027 lu (< 0.05 threshold)

  • Ma_max: 0.048 (< 0.1 threshold)

  • Density conservation: 0.0037% (< 0.01% threshold)

  • Torque sign: Correct (positive — body pumps momentum into fluid)

  • Step time: 0.98 s/step on RTX 2060 GPU

Convergence rate (measured at 64^3, 2026-03-23)#

Confinement ratio 0.30, two-pass BB, omega = 0.003, tau = 0.8, Ma = 0.05. Converged at step 4400 (~2.1 rotation periods), rel_change = 0.73%. Convergence criterion: 2% relative change in mean drag torque between consecutive rotation periods.

Convergence in rotation periods is resolution-independent (same physics). At 192^3 with period = 6,283 steps: expected convergence at ~13,000 steps.

Cloud rehearsal (DONE — PASS, 2026-03-23)#

Setup: A100 SXM 80GB on RunPod (US), 192^3, confinement ratio 0.30, two-pass BB, omega = 0.001 (Ma = 0.05), simple BB, 500 steps.

Results:

  • All 6 gates PASS

  • NaN: False, Inf: False

  • u_max: 0.028 lu (< 0.05)

  • Density conservation: 0.001% (< 0.01%)

  • Torque sign: Correct

  • Step time: 0.058 s/step on A100 SXM 80GB

Issues fixed during rehearsal:

  1. GPU type: A100-80GB (PCIe) → A100-80GB-SXM (SXM)

  2. cuDNN: Docker image CUDA 12.2 incompatible with host driver 570 (CUDA 12.8). Fixed by adding pip3 install --upgrade 'jax[cuda12]' as first setup step.

  3. SkyPilot lifecycle: stream_and_get returns at job submission, not completion. Fixed with SSH polling in launch script.

  4. Git hash: .git/ not synced to cloud. Fixed by writing hash to file pre-sync.

GPU choice (REVISED after rehearsal)#

A100 SXM at $1.49/hr (RunPod). Revised rationale:

  • Measured step time at 192^3: 0.058 s/step — 17x faster than RTX 2060 extrapolation

  • The H100 SXM advantage (1.68x bandwidth) gives ~0.035 s/step — only 0.023s faster

  • At these step times, the H100 premium ($2.69 vs $1.49/hr) costs more than it saves

  • A100 is cheaper for this job: $1.21 vs $1.31 on H100

Resolution (DECIDED)#

192^3. Fin circumferential arc = 4.1 lu (well-resolved).

Revised cost estimate (from measured A100 SXM timing)#

Scenario

Steps

A100 SXM step time

Time per ratio

4 ratios

Cost

Optimistic (1.5 periods)

9,400

0.058s

9 min

36 min

$0.89

Expected (2 periods)

12,600

0.058s

12 min

49 min

$1.21

Conservative (3 periods)

18,800

0.058s

18 min

73 min

$1.81

Budget: $50.00. Expected cost: $1.21. Reserve: $48.79. Massive headroom — enough for multiple re-runs, Track C collection, orientation repeats, held-out test points, and future resolution escalation if needed.

Decision: PROCEED with T2.6#

All pre-launch gate checks pass. Architecture confirmed. Cloud rehearsal PASS. Ready for A100 SXM production launch.