# Pre-T2.6 Gate: Fin Resolution and Stability

## Fin resolution study (DONE)

All four tested resolutions (64^3, 96^3, 128^3, 192^3) produce distinguishable fins,
with 3 angular bands per z-slice, matching the 3 fins per set at 120-degree spacing.

| Resolution | Circ. arc (lu) | Radial extent (lu) | Angular bands |
|---|---|---|---|
| 64^3 | 1.4 | 2.5 | 3 |
| 96^3 | 2.0 | 5.5 | 3 |
| 128^3 | 2.6 | 6.0 | 3 |
| 192^3 | 4.1 | 10.0 | 3 |

Note: these are the corrected fin geometry values, with the MIME-ANO-003 fix applied
(`fin_thickness = 0.15 mm`, used as the circumferential blade thickness per §VI.F p.15).

**Minimum viable resolution**: 128^3 (circ. arc = 2.6 lu — marginal but usable with Bouzidi IBB).
**Target resolution for T2.6**: 192^3 (circ. arc = 4.1 lu — well-resolved).

## Two-pass bounce-back architecture (CONFIRMED)

A single BB pass that applies a combined wall velocity to both the pipe and the UMR
causes a compressibility instability, because the imposed wall velocity exceeds
Ma = 0.1 at the pipe wall radius. The corrected architecture uses two sequential BB
passes with disjoint missing masks:

1. **Pass 1**: Pipe wall (stationary) — `apply_bounce_back(f_post, f_pre, pipe_missing, solid, wall_velocity=None)`
2. **Pass 2**: UMR (rotating) — `apply_bounce_back(f, f_pre, umr_missing, solid, wall_velocity=omega_x_r)`

Both passes use the same `f_pre` (from `lbm_step_split`), and the second pass receives
the output of the first as its `f_post_stream` argument. The `solid_mask` parameter
is vestigial in both BB functions: it is unused in the computation, which is driven
entirely by `missing_mask`.

Missing masks are disjoint: pipe wall boundary links and UMR boundary links never
share a node (minimum gap = 19 lu at worst-case confinement ratio 0.40).
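
The chaining of the two passes can be illustrated with a minimal sketch on a 1D D1Q3 toy lattice, using plain Python lists rather than the project's arrays. `bounce_back_pass` is a hypothetical stand-in, not the codebase's `apply_bounce_back`; it assumes the standard halfway bounce-back rule with a moving-wall momentum correction, and it omits the vestigial solid-mask argument. The left wall plays the role of the stationary pipe and the right wall the role of the moving UMR surface.

```python
# D1Q3 toy lattice: direction 0 is rest, 1 is +x, 2 is -x.
C = [0, 1, -1]
W = [2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0]
OPP = [0, 2, 1]          # index of the opposite direction
CS2 = 1.0 / 3.0          # lattice speed of sound squared


def bounce_back_pass(f_post_stream, f_pre, missing, wall_velocity=None):
    """Hypothetical halfway bounce-back on the links flagged in `missing`.

    f_post_stream[q][x], f_pre[q][x]: populations after / before streaming.
    missing[q][x]: True where population q at node x would arrive from a solid.
    wall_velocity[x]: wall velocity seen by node x; None means a stationary wall.
    """
    n = len(f_pre[0])
    out = [row[:] for row in f_post_stream]
    for q in range(len(C)):
        for x in range(n):
            if not missing[q][x]:
                continue
            # Reflect the pre-streaming population that was heading into the wall.
            value = f_pre[OPP[q]][x]
            if wall_velocity is not None:
                rho = sum(f_pre[k][x] for k in range(len(C)))
                # Moving-wall correction: 2 * w_q * rho * (c_q . u_wall) / c_s^2
                value += 2.0 * W[q] * rho * C[q] * wall_velocity[x] / CS2
            out[q][x] = value
    return out


n = 4
f_pre = [[W[q]] * n for q in range(3)]       # fluid at rest, rho = 1
f_streamed = [row[:] for row in f_pre]       # placeholder for the streamed state
f_streamed[1][0] = 0.0                       # unknown: would come from the left wall
f_streamed[2][n - 1] = 0.0                   # unknown: would come from the right wall

pipe_missing = [[False] * n for _ in range(3)]
pipe_missing[1][0] = True                    # stationary wall on the left
umr_missing = [[False] * n for _ in range(3)]
umr_missing[2][n - 1] = True                 # moving wall on the right
u_wall = [0.0] * (n - 1) + [0.01]            # illustrative wall speed

# The two masks must never flag the same link.
assert not any(pipe_missing[q][x] and umr_missing[q][x]
               for q in range(3) for x in range(n))

f = bounce_back_pass(f_streamed, f_pre, pipe_missing)               # pass 1
f = bounce_back_pass(f, f_pre, umr_missing, wall_velocity=u_wall)   # pass 2
```

Because the masks are disjoint, the order of the two passes does not change the result in this sketch; each pass only overwrites its own links and reads the shared `f_pre`.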

For Bouzidi IBB: each pass gets its own q-values computed from its own SDF.
Pipe wall q-values can use analytical `compute_q_values_cylinder`. UMR q-values
use `compute_q_values_sdf` with `umr_sdf`.

## Union SDF (CONFIRMED — optional with two-pass)

With two-pass BB, each pass uses its own SDF for q-value computation. A union SDF
(`min(pipe_sdf, umr_sdf)`) is not required for the two-pass architecture but would
be needed for a single-pass variant.

No pipe wall SDF exists in the codebase. If single-pass or combined q-value computation
is needed later, a trivial cylinder SDF should be added to `helix_geometry.py`:
`pipe_sdf(pts) = R_vessel - sqrt(dx^2 + dy^2)`, where `dx` and `dy` are the offsets
from the pipe axis (positive inside pipe = fluid, negative outside = solid wall).
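
A minimal scalar sketch of that cylinder SDF and of the union follows. It assumes the pipe axis runs along z through a centre `(cx, cy)`, which is a hypothetical parameter, and that `umr_sdf` uses the same sign convention (positive in fluid, negative inside the body) so that `min` yields the union of the solids. A version for `helix_geometry.py` would take an array of points rather than scalars.

```python
import math


def pipe_sdf(x, y, r_vessel, cx=0.0, cy=0.0):
    """Signed distance to the pipe wall: positive in fluid, negative in the wall."""
    return r_vessel - math.hypot(x - cx, y - cy)


def union_sdf(pipe_value, umr_value):
    """Distance to the nearest solid, assuming both inputs are positive in fluid."""
    return min(pipe_value, umr_value)


# A point 5 lu from the axis of a pipe with radius 10 lu is 5 lu inside the fluid.
d_pipe = pipe_sdf(3.0, 4.0, r_vessel=10.0)
# If the same point sits 1 lu inside the UMR, the union reports solid.
d_union = union_sdf(d_pipe, -1.0)
```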

## Mach number guard (CONFIRMED)

Constraint: `omega * R_fin_lu * sqrt(3) < 0.1` (Ma < 0.1 at fin tips).

| Resolution | R_fin (lu) | Max safe omega | Period (steps) |
|---|---|---|---|
| 64^3 | 9.7 | 0.00596 | 1,054 |
| 128^3 | 19.3 | 0.00299 | 2,101 |
| 192^3 | 29.0 | 0.00199 | 3,156 |

Target Ma = 0.05 (half of limit for safety margin):

| Resolution | Safe omega (Ma=0.05) | Period (steps) |
|---|---|---|
| 64^3 | 0.00299 | 2,104 |
| 128^3 | 0.00149 | 4,209 |
| 192^3 | 0.00100 | 6,283 |

Guard implemented as an assertion at sweep initialisation (checked once, not per-step).
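
The table entries and the guard can be reproduced with a short sketch. The function names are hypothetical and not taken from the codebase; the only inputs are the constraint stated above and the lattice speed of sound `c_s = 1/sqrt(3)`.

```python
import math

INV_CS = math.sqrt(3.0)  # 1 / c_s in lattice units


def max_safe_omega(r_fin_lu, ma_limit=0.1):
    """Largest omega (rad/step) keeping the fin-tip Mach number below ma_limit."""
    return ma_limit / (r_fin_lu * INV_CS)


def rotation_period_steps(omega):
    """Number of time steps per full rotation."""
    return 2.0 * math.pi / omega


def assert_mach_guard(omega, r_fin_lu, ma_limit=0.1):
    """One-off check at sweep initialisation; returns the fin-tip Mach number."""
    ma_tip = omega * r_fin_lu * INV_CS
    assert ma_tip < ma_limit, f"fin-tip Ma = {ma_tip:.3f} exceeds {ma_limit}"
    return ma_tip


# 128^3 row: R_fin = 19.3 lu
omega_limit = max_safe_omega(19.3)            # close to 0.00299
omega_target = max_safe_omega(19.3, 0.05)     # close to 0.00149
ma_tip = assert_mach_guard(0.00149, 19.3)     # passes, Ma just under 0.05
```

Small differences from the tabulated periods are expected, since the table appears to use unrounded values of `R_fin` and omega.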

## 128^3 rotating stability check (DONE — PASS, 2026-03-23)

**Setup**: 128x128x128, confinement ratio 0.30, two-pass BB (pipe static + UMR rotating),
omega = 0.00149 rad/step (Ma = 0.05), tau = 0.8, simple BB, 1000 steps.

**Results**:
- NaN: False
- Inf: False
- u_max: 0.027 lu/step (< 0.05 threshold)
- Ma_max: 0.048 (< 0.1 threshold)
- Density conservation: 0.0037% (< 0.01% threshold)
- Torque sign: Correct (positive — body pumps momentum into fluid)
- Step time: **0.98 s/step** on RTX 2060 GPU

## Convergence rate (measured at 64^3, 2026-03-23)

Confinement ratio 0.30, two-pass BB, omega = 0.003, tau = 0.8, Ma = 0.05.
**Converged at step 4400 (~2.1 rotation periods)**, rel_change = 0.73%.
Convergence criterion: 2% relative change in mean drag torque between
consecutive rotation periods.
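
One way to express this criterion is sketched below. It assumes a per-step torque history and compares the mean over the most recent rotation period with the mean over the period before it, normalised by the earlier mean; the function names and the normalisation are assumptions, not the project's implementation.

```python
def period_to_period_change(torque_history, period_steps):
    """Relative change in mean torque between the last two rotation periods.

    Returns None until two full periods of history are available.
    """
    if len(torque_history) < 2 * period_steps:
        return None
    last = torque_history[-period_steps:]
    prev = torque_history[-2 * period_steps:-period_steps]
    mean_last = sum(last) / period_steps
    mean_prev = sum(prev) / period_steps
    if mean_prev == 0.0:
        return float("inf")
    return abs(mean_last - mean_prev) / abs(mean_prev)


def is_converged(torque_history, period_steps, tol=0.02):
    """True once the period-to-period change drops below tol (2% by default)."""
    change = period_to_period_change(torque_history, period_steps)
    return change is not None and change < tol


# Illustrative data only: a 1% drift between periods counts as converged.
history = [1.0] * 10 + [1.01] * 10
converged = is_converged(history, period_steps=10)
```

With trailing windows like these, the earliest possible detection is after two full periods, which is consistent with convergence being reported at ~2.1 periods.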

Convergence measured in rotation periods is expected to be resolution-independent
(same physics). At 192^3 with period = 6,283 steps, ~2.1 periods puts expected
convergence at ~13,000 steps.

## Cloud rehearsal (DONE — PASS, 2026-03-23)

**Setup**: A100 SXM 80GB on RunPod (US), 192^3, confinement ratio 0.30,
two-pass BB, omega = 0.001 (Ma = 0.05), simple BB, 500 steps.

**Results**:
- All 6 gates PASS
- NaN: False, Inf: False
- u_max: 0.028 lu/step (< 0.05)
- Density conservation: 0.001% (< 0.01%)
- Torque sign: Correct
- **Step time: 0.058 s/step on A100 SXM 80GB**

**Issues fixed during rehearsal**:
1. GPU type: `A100-80GB` (PCIe) → `A100-80GB-SXM` (SXM)
2. cuDNN: Docker image CUDA 12.2 incompatible with host driver 570 (CUDA 12.8).
   Fixed by adding `pip3 install --upgrade 'jax[cuda12]'` as first setup step.
3. SkyPilot lifecycle: `stream_and_get` returns at job submission, not completion.
   Fixed with SSH polling in launch script.
4. Git hash: `.git/` not synced to cloud. Fixed by writing hash to file pre-sync.

## GPU choice (REVISED after rehearsal)

**A100 SXM** at $1.49/hr (RunPod). Revised rationale:
- Measured step time at 192^3: **0.058 s/step** — 17x faster than RTX 2060 extrapolation
- The H100 SXM advantage (1.68x bandwidth) would give an estimated ~0.035 s/step (0.058 / 1.68), only 0.023 s/step faster
- At these step times, the H100 price premium ($2.69 vs $1.49/hr) costs more than the shorter runtime saves
- **A100 is cheaper for this job**: $1.21 vs $1.31 on H100 for the expected 4-ratio sweep
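
The comparison reduces to a break-even check: cost per step is step time multiplied by hourly rate, so the H100 only wins if its speedup exceeds its price ratio. The sketch below uses the figures from this section and assumes, as the estimate above does, that step time scales inversely with memory bandwidth; the H100 step time is therefore an extrapolation, not a measurement.

```python
A100_STEP_S = 0.058       # measured at 192^3
BANDWIDTH_RATIO = 1.68    # H100 SXM vs A100 SXM, as quoted above
A100_RATE = 1.49          # USD per hour
H100_RATE = 2.69          # USD per hour


def cost_per_step(step_time_s, rate_usd_per_hr):
    """Cost in USD of a single time step."""
    return step_time_s * rate_usd_per_hr / 3600.0


h100_step_s = A100_STEP_S / BANDWIDTH_RATIO      # bandwidth-bound assumption
price_ratio = H100_RATE / A100_RATE              # about 1.81

a100_usd = cost_per_step(A100_STEP_S, A100_RATE)
h100_usd = cost_per_step(h100_step_s, H100_RATE)

# The H100 is cheaper only if its speedup beats its price ratio.
h100_is_cheaper = BANDWIDTH_RATIO > price_ratio
```

Since the 1.68x speedup is below the ~1.81x price ratio, the A100 stays cheaper regardless of how many steps the job runs.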

## Resolution (DECIDED)

**192^3**. Fin circumferential arc = 4.1 lu (well-resolved).

## Revised cost estimate (from measured A100 SXM timing)

| Scenario | Steps | A100 SXM step time | Time per ratio | 4 ratios | Cost |
|---|---|---|---|---|---|
| Optimistic (1.5 periods) | 9,400 | 0.058s | 9 min | 36 min | $0.89 |
| **Expected (2 periods)** | **12,600** | **0.058s** | **12 min** | **49 min** | **$1.21** |
| Conservative (3 periods) | 18,800 | 0.058s | 18 min | 73 min | $1.81 |
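
The table rows follow from simple arithmetic, sketched here with a hypothetical helper. It uses the unrounded period of 6,283 steps, so step counts and costs differ slightly from the tabulated values, which round the steps to the nearest hundred.

```python
PERIOD_STEPS = 6283        # rotation period at 192^3, Ma = 0.05
STEP_TIME_S = 0.058        # measured on A100 SXM
RATE_USD_PER_HR = 1.49     # RunPod A100 SXM
N_RATIOS = 4               # confinement ratios in the sweep


def sweep_estimate(periods):
    """Return (steps per ratio, total minutes, total cost in USD)."""
    steps = periods * PERIOD_STEPS
    total_seconds = steps * STEP_TIME_S * N_RATIOS
    cost = total_seconds / 3600.0 * RATE_USD_PER_HR
    return steps, total_seconds / 60.0, cost


for label, periods in [("optimistic", 1.5), ("expected", 2), ("conservative", 3)]:
    steps, minutes, cost = sweep_estimate(periods)
    print(f"{label}: {steps:.0f} steps, {minutes:.0f} min, ${cost:.2f}")
```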

**Budget**: $50.00. Expected cost: $1.21. Reserve: $48.79.
Massive headroom — enough for multiple re-runs, Track C collection, orientation repeats,
held-out test points, and future resolution escalation if needed.

## Decision: PROCEED with T2.6

All pre-launch gate checks pass. Architecture confirmed. Cloud rehearsal PASS.
Ready for A100 SXM production launch.