Pre-T2.6 Gate: Fin Resolution and Stability#
Fin resolution study (DONE)#
All 4 tested resolutions (64^3 through 192^3) produce distinguishable fins with 3 angular bands per z-slice (matching 3 fins per set at 120 degree spacing).
| Resolution | Circ. arc (lu) | Radial extent (lu) | Angular bands |
|---|---|---|---|
| 64^3 | 1.4 | 2.5 | 3 |
| 96^3 | 2.0 | 5.5 | 3 |
| 128^3 | 2.6 | 6.0 | 3 |
| 192^3 | 4.1 | 10.0 | 3 |
Note: these are corrected fin geometry values (MIME-ANO-003 fix applied: fin_thickness = 0.15 mm is used as the circumferential blade thickness per §VI.F p.15).
Minimum viable resolution: 128^3 (circ. arc = 2.6 lu — marginal but usable with Bouzidi IBB). Target resolution for T2.6: 192^3 (circ. arc = 4.1 lu — well-resolved).
Two-pass bounce-back architecture (CONFIRMED)#
Single-pass BB with combined wall velocity for pipe + UMR causes compressibility instability (Ma > 0.1 at pipe wall radius). The corrected architecture uses two sequential BB passes with disjoint missing masks:
Pass 1, pipe wall (stationary): apply_bounce_back(f_post, f_pre, pipe_missing, solid, wall_velocity=None)
Pass 2, UMR (rotating): apply_bounce_back(f, f_pre, umr_missing, solid, wall_velocity=omega_x_r)
Both passes use the same f_pre (from lbm_step_split). The second pass receives
the output of the first as its f_post_stream argument. The solid_mask parameter
is vestigial in both BB functions (unused in computation, driven by missing_mask).
Missing masks are disjoint: pipe wall boundary links and UMR boundary links never share a node (minimum gap = 19 lu at worst-case confinement ratio 0.40).
For Bouzidi IBB: each pass gets its own q-values computed from its own SDF.
Pipe wall q-values can use analytical compute_q_values_cylinder. UMR q-values
use compute_q_values_sdf with umr_sdf.
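The composition of the two passes can be illustrated with a toy model. The sketch below is not the project's apply_bounce_back: it is a hypothetical stand-in (apply_bounce_back_toy) that works on a flat list in which each index is one boundary link, and it assumes the moving-wall correction has already been reduced to a per-link additive term (wall_term). It only shows the data flow: both passes read the same f_pre, the second pass starts from the output of the first, and the masks are checked for overlap.

```python
def apply_bounce_back_toy(f_post_stream, f_pre, missing, wall_term=None):
    """Toy stand-in for a bounce-back pass on a flat list of links.

    Index k is one (node, direction) link. For simplicity, f_pre[k] is
    taken to be the pre-streaming population that reflects into link k.
    """
    out = list(f_post_stream)
    for k, is_missing in enumerate(missing):
        if is_missing:
            # Reflect the pre-streaming value; a moving wall adds a signed term.
            extra = wall_term[k] if wall_term is not None else 0.0
            out[k] = f_pre[k] + extra
    return out


def two_pass_bounce_back(f_post, f_pre, pipe_missing, umr_missing, umr_wall_term):
    # The two masks must never flag the same link.
    overlap = any(p and u for p, u in zip(pipe_missing, umr_missing))
    assert not overlap, "pipe and UMR missing masks overlap"
    # Pass 1: stationary pipe wall (no wall-velocity term).
    f = apply_bounce_back_toy(f_post, f_pre, pipe_missing)
    # Pass 2: rotating UMR, fed the output of pass 1 but the same f_pre.
    return apply_bounce_back_toy(f, f_pre, umr_missing, umr_wall_term)


# Made-up values, chosen only to make the data flow visible.
f_pre = [0.10, 0.20, 0.30, 0.40, 0.50]
f_post = [0.11, 0.21, 0.31, 0.41, 0.51]
pipe_missing = [True, False, False, False, False]
umr_missing = [False, False, False, True, False]
umr_wall_term = [0.0, 0.0, 0.0, 0.02, 0.0]

result = two_pass_bounce_back(f_post, f_pre, pipe_missing, umr_missing, umr_wall_term)
print(result)
```

Because the masks are disjoint, each link is overwritten by at most one pass, so the second pass cannot undo the first.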
Union SDF (CONFIRMED — optional with two-pass)#
With two-pass BB, each pass uses its own SDF for q-value computation. A union SDF
(min(pipe_sdf, umr_sdf)) is not required for the two-pass architecture but would
be needed for a single-pass variant.
No pipe wall SDF exists in the codebase. A trivial cylinder SDF should be added to
helix_geometry.py if single-pass or combined q-value computation is needed later:
pipe_sdf(pts) = R_vessel - sqrt(dx^2 + dy^2) (positive inside pipe = fluid,
negative outside = solid wall).
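A minimal sketch of such a cylinder SDF follows. It assumes the pipe axis runs along z (suggested by the dx, dy form of the formula), and the center_xy parameter and the union_sdf helper are illustrative names, not existing functions in helix_geometry.py. The union helper further assumes that umr_sdf uses the same sign convention (positive in fluid).

```python
import math


def pipe_sdf(pt, r_vessel, center_xy=(0.0, 0.0)):
    """Signed distance to the pipe wall: positive in fluid, negative in the wall.

    pt is an (x, y, z) tuple; z is ignored because the axis is assumed to be z.
    """
    dx = pt[0] - center_xy[0]
    dy = pt[1] - center_xy[1]
    return r_vessel - math.hypot(dx, dy)


def union_sdf(pipe_value, umr_value):
    """Combined SDF for a single-pass variant: solid if either body is solid."""
    return min(pipe_value, umr_value)


# A point on the axis is r_vessel away from the wall; a point beyond it is negative.
print(pipe_sdf((0.0, 0.0, 5.0), r_vessel=10.0))
print(pipe_sdf((12.0, 0.0, 0.0), r_vessel=10.0))
```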
Mach number guard (CONFIRMED)#
Constraint: omega * R_fin_lu * sqrt(3) < 0.1 (Ma < 0.1 at fin tips).
| Resolution | R_fin (lu) | Max safe omega (rad/step) | Period (steps) |
|---|---|---|---|
| 64^3 | 9.7 | 0.00596 | 1,054 |
| 128^3 | 19.3 | 0.00299 | 2,101 |
| 192^3 | 29.0 | 0.00199 | 3,156 |
Target Ma = 0.05 (half of limit for safety margin):
| Resolution | Safe omega at Ma = 0.05 (rad/step) | Period (steps) |
|---|---|---|
| 64^3 | 0.00299 | 2,104 |
| 128^3 | 0.00149 | 4,209 |
| 192^3 | 0.00100 | 6,283 |
Guard implemented as an assertion at sweep initialisation (checked once, not per-step).
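The constraint and the tabulated limits can be reproduced with a few lines. This is a sketch with illustrative function names, not the sweep's actual guard. It assumes the standard lattice speed of sound c_s = 1/sqrt(3), which is where the sqrt(3) factor comes from. Because R_fin is quoted to one decimal, recomputed values can differ from the tables in the last digit.

```python
import math

INV_CS = math.sqrt(3.0)  # 1 / c_s in lattice units, with c_s = 1/sqrt(3)


def tip_mach(omega, r_fin_lu):
    """Mach number at the fin tip for angular velocity omega (rad/step)."""
    return omega * r_fin_lu * INV_CS


def max_safe_omega(r_fin_lu, ma_limit=0.1):
    """Largest omega that keeps the tip Mach number below ma_limit."""
    return ma_limit / (r_fin_lu * INV_CS)


def rotation_period_steps(omega):
    """Number of time steps per full rotation."""
    return 2.0 * math.pi / omega


def assert_mach_guard(omega, r_fin_lu, ma_limit=0.1):
    """One-off check intended to run at sweep initialisation."""
    ma = tip_mach(omega, r_fin_lu)
    assert ma < ma_limit, f"Ma = {ma:.4f} is not below the limit {ma_limit}"
    return ma


for label, r_fin in [("64^3", 9.7), ("128^3", 19.3), ("192^3", 29.0)]:
    omega = max_safe_omega(r_fin)
    period = rotation_period_steps(omega)
    print(f"{label}: omega_max = {omega:.5f} rad/step, period = {period:.0f} steps")
```

Calling assert_mach_guard(0.00149, 19.3) passes with a tip Mach number just under 0.05, matching the 128^3 target setting.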
128^3 rotating stability check (DONE — PASS, 2026-03-23)#
Setup: 128x128x128, confinement ratio 0.30, two-pass BB (pipe static + UMR rotating), omega = 0.00149 rad/step (Ma = 0.05), tau = 0.8, simple BB, 1000 steps.
Results:
NaN: False
Inf: False
u_max: 0.027 lu (< 0.05 threshold)
Ma_max: 0.048 (< 0.1 threshold)
Density conservation: 0.0037% (< 0.01% threshold)
Torque sign: Correct (positive — body pumps momentum into fluid)
Step time: 0.98 s/step on RTX 2060 GPU
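The pass/fail logic behind these results can be written as a small table of checks. The sketch below assumes the gates are exactly the six items listed above (NaN, Inf, u_max, Ma_max, density conservation, torque sign). The metric key names are hypothetical; the thresholds and values are the ones reported for this run.

```python
def evaluate_gates(metrics):
    """Return a dict of gate name -> bool using the thresholds quoted above."""
    return {
        "no_nan": not metrics["has_nan"],
        "no_inf": not metrics["has_inf"],
        "u_max": metrics["u_max"] < 0.05,
        "ma_max": metrics["ma_max"] < 0.1,
        "density": metrics["density_drift_pct"] < 0.01,
        "torque_sign": metrics["torque_positive"],
    }


# Values reported for the 128^3 run.
run_128 = {
    "has_nan": False,
    "has_inf": False,
    "u_max": 0.027,
    "ma_max": 0.048,
    "density_drift_pct": 0.0037,
    "torque_positive": True,
}

gates = evaluate_gates(run_128)
for name, passed in gates.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
print("overall:", "PASS" if all(gates.values()) else "FAIL")
```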
Convergence rate (measured at 64^3, 2026-03-23)#
Confinement ratio 0.30, two-pass BB, omega = 0.003, tau = 0.8, Ma = 0.05. Converged at step 4400 (~2.1 rotation periods), rel_change = 0.73%. Convergence criterion: 2% relative change in mean drag torque between consecutive rotation periods.
Convergence in rotation periods is resolution-independent (same physics). At 192^3 with period = 6,283 steps, ~2.1 periods gives expected convergence at ~13,000 steps.
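The convergence criterion can be sketched as follows. This version assumes the torque history is averaged over non-overlapping whole rotation periods. The measured run reports convergence at ~2.1 periods, so the actual check may use a different windowing. Function names and the synthetic signal are illustrative only.

```python
import math


def period_means(torque, period_steps):
    """Mean torque over each complete, non-overlapping rotation period."""
    n_periods = len(torque) // period_steps
    return [
        sum(torque[i * period_steps:(i + 1) * period_steps]) / period_steps
        for i in range(n_periods)
    ]


def first_converged_period(torque, period_steps, tol=0.02):
    """Return (period index, rel_change) for the first period whose mean
    differs from the previous period's mean by less than tol, else None."""
    means = period_means(torque, period_steps)
    for i in range(1, len(means)):
        if means[i] == 0.0:
            continue  # relative change is undefined for a zero mean
        rel_change = abs(means[i] - means[i - 1]) / abs(means[i])
        if rel_change < tol:
            return i, rel_change
    return None


# Synthetic torque that relaxes exponentially towards 1.0 (not simulation data).
period = 100
signal = [1.0 - math.exp(-k / 60.0) for k in range(5 * period)]
converged = first_converged_period(signal, period)
print(converged)
```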
Cloud rehearsal (DONE — PASS, 2026-03-23)#
Setup: A100 SXM 80GB on RunPod (US), 192^3, confinement ratio 0.30, two-pass BB, omega = 0.001 (Ma = 0.05), simple BB, 500 steps.
Results:
All 6 gates PASS
NaN: False, Inf: False
u_max: 0.028 lu (< 0.05)
Density conservation: 0.001% (< 0.01%)
Torque sign: Correct
Step time: 0.058 s/step on A100 SXM 80GB
Issues fixed during rehearsal:
GPU type: A100-80GB (PCIe) → A100-80GB-SXM (SXM)
cuDNN: Docker image CUDA 12.2 incompatible with host driver 570 (CUDA 12.8). Fixed by adding pip3 install --upgrade 'jax[cuda12]' as first setup step.
SkyPilot lifecycle: stream_and_get returns at job submission, not completion. Fixed with SSH polling in launch script.
Git hash: .git/ not synced to cloud. Fixed by writing hash to file pre-sync.
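The polling fix follows a common pattern: keep checking a completion condition at a fixed interval until it holds or a timeout expires. The sketch below shows only that generic loop. The is_done callable is a hypothetical placeholder for whatever the launch script checks over SSH, and none of the names here come from SkyPilot or the project.

```python
import time


def poll_until_done(is_done, interval_s=30.0, timeout_s=3600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call is_done() every interval_s seconds until it returns True.

    Returns True on completion and False if timeout_s elapses first.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if is_done():
            return True
        sleep(interval_s)
    return False


# Example with a fake check that reports completion on the third call.
calls = {"n": 0}


def fake_done():
    calls["n"] += 1
    return calls["n"] >= 3


finished = poll_until_done(fake_done, interval_s=0.0, timeout_s=5.0,
                           sleep=lambda seconds: None)
print(finished, calls["n"])
```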
GPU choice (REVISED after rehearsal)#
A100 SXM at $1.49/hr (RunPod). Revised rationale:
Measured step time at 192^3: 0.058 s/step — 17x faster than RTX 2060 extrapolation
The H100 SXM advantage (1.68x bandwidth) gives ~0.035 s/step — only 0.023s faster
At these step times, the H100 premium ($2.69 vs $1.49/hr) costs more than it saves
A100 is cheaper for this job: $1.21 vs $1.31 on H100
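The comparison reduces to simple arithmetic. The sketch below assumes the job is the expected scenario of 12,600 steps at each of 4 confinement ratios, and it takes the H100 step time as the rounded 0.035 s/step estimate quoted above. Hourly rates are the ones listed, and the function name is illustrative.

```python
def job_cost_usd(steps, step_time_s, rate_usd_per_hr, n_ratios=4):
    """Cost of running `steps` steps at each of n_ratios confinement ratios."""
    hours = steps * step_time_s * n_ratios / 3600.0
    return hours * rate_usd_per_hr


steps = 12_600        # assumed: expected scenario
a100_step = 0.058     # measured, s/step
h100_step = 0.035     # estimate from bandwidth scaling, s/step

# Bandwidth scaling behind the H100 estimate: 0.058 / 1.68.
h100_step_scaled = a100_step / 1.68

a100_cost = job_cost_usd(steps, a100_step, 1.49)
h100_cost = job_cost_usd(steps, h100_step, 2.69)

print(f"H100 step time from scaling: {h100_step_scaled:.4f} s")
print(f"A100: ${a100_cost:.2f}  H100: ${h100_cost:.2f}")
```

The H100 finishes sooner, but its hourly rate is about 1.8x higher while its step time is only about 1.7x shorter, so the total comes out slightly higher.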
Resolution (DECIDED)#
192^3. Fin circumferential arc = 4.1 lu (well-resolved).
Revised cost estimate (from measured A100 SXM timing)#
| Scenario | Steps | A100 SXM step time | Time per ratio | Time for 4 ratios | Cost |
|---|---|---|---|---|---|
| Optimistic (1.5 periods) | 9,400 | 0.058 s | 9 min | 36 min | $0.89 |
| Expected (2 periods) | 12,600 | 0.058 s | 12 min | 49 min | $1.21 |
| Conservative (3 periods) | 18,800 | 0.058 s | 18 min | 73 min | $1.81 |
Budget: $50.00. Expected cost: $1.21. Reserve: $48.79. Massive headroom — enough for multiple re-runs, Track C collection, orientation repeats, held-out test points, and future resolution escalation if needed.
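The estimate can be rebuilt from the rotation period and the measured step time. The step counts in the table appear consistent with periods x 6,283 steps rounded to the nearest hundred. That rounding rule is an inference, not something stated in the source. Recomputed costs can differ from the table by about a cent, because the table seems to round the minutes first.

```python
PERIOD_STEPS = 6_283      # rotation period at 192^3, Ma = 0.05
STEP_TIME_S = 0.058       # measured on A100 SXM
RATE_USD_PER_HR = 1.49
N_RATIOS = 4
BUDGET_USD = 50.00


def scenario(periods):
    """Return (steps, minutes per ratio, total minutes, cost in USD)."""
    steps = int(round(periods * PERIOD_STEPS, -2))  # assumed rounding rule
    minutes_per_ratio = steps * STEP_TIME_S / 60.0
    total_minutes = minutes_per_ratio * N_RATIOS
    cost = total_minutes / 60.0 * RATE_USD_PER_HR
    return steps, minutes_per_ratio, total_minutes, cost


rows = {name: scenario(p) for name, p in
        [("Optimistic", 1.5), ("Expected", 2.0), ("Conservative", 3.0)]}

for name, (steps, per_ratio, total, cost) in rows.items():
    print(f"{name}: {steps} steps, {per_ratio:.0f} min/ratio, "
          f"{total:.0f} min total, ${cost:.2f}")

reserve = BUDGET_USD - rows["Expected"][3]
print(f"Reserve after expected run: ${reserve:.2f}")
```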
Decision: PROCEED with T2.6#
All pre-launch gate checks pass. Architecture confirmed. Cloud rehearsal PASS. Ready for A100 SXM production launch.