SERIES 12 — STANDARD OPERATING PROCEDURE (SOP) FOR DATA SAVING
SERIES 12 — STANDARD OPERATING PROCEDURE (SOP) FOR DATA SAVING
For Google AI (New Window Synchronization)
EXECUTIVE SUMMARY
The Series 12 solver uses a strict, battle-tested data retention protocol designed to prevent data loss during Colab runtime disconnects. This SOP is mandatory for all production runs.
Core Philosophy: Print Everything. Save Everything. Trust Nothing.
PART I: THE THREE-TIER DATA RETENTION STRATEGY
text
┌─────────────────────────────────────────────────────────────────────────────┐
│ THREE-TIER DATA RETENTION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ TIER 1: STDOUT (Human-Readable Recovery) │
│ ───────────────────────────────────────── │
│ • ALL data printed to console with flush=True │
│ • Copy-paste into chat if runtime disconnects │
│ • Serves as cheap & dirty backup │
│ │
│ TIER 2: Individual .tar Files (Per Test / Per Kappa) │
│ ───────────────────────────────────────────── │
│ • Each test saves its own .tar file │
│ • Each kappa saves its own .tar file │
│ • Each data class (data, history, summary) saves separately │
│ • Files saved to /content/ (workspace root) │
│ │
│ TIER 3: Combined Summary .tar │
│ ──────────────────────────────────── │
│ • Master summary of all runs │
│ • JSON format for easy parsing │
│ • Downloaded last │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
PART II: NO-GO ZONES (What We NEVER Do)
Forbidden Action Why
ZIP files Use .tar only (preserves metadata, symlinks)
Google Drive Avoids authentication issues, faster I/O
Nested folders All .tar files in workspace root
Buffered prints Always use flush=True
Temporary files Always clean up with try...finally
Single archive Individual .tar per test, per kappa
PART III: FILE NAMING CONVENTION
text
{test_name}_{timestamp}.tar
Examples:
text
test1_single_step_20260705_013138.tar
test2_hamiltonian_20260705_013138.tar
test3_interface_20260705_013138.tar
test4_cfl_20260705_013138.tar
test5_conservation_20260705_013138.tar
test6_spatial_20260705_013138.tar
test7_symmetry_20260705_013138.tar
test8_temporal_20260705_013138.tar
test9_stiffness_20260705_013138.tar
Kappa Sweep Naming:
text
kappa_{value}_data_{timestamp}.tar
kappa_{value}_history_{timestamp}.tar
kappa_{value}_summary_{timestamp}.tar
Examples:
text
kappa_0_1000_data_20260704_120000.tar
kappa_0_1000_history_20260704_120000.tar
kappa_0_1000_summary_20260704_120000.tar
PART IV: WHAT EACH .tar CONTAINS
File Type Contents
*_data.tar data.json — H₀, H_f, drift, elapsed, parameters
*_history.tar history.npz — steps, H, drift, dx, dt arrays
*_summary.tar summary.txt — human-readable summary
*_fields.tar fields.npz — Pxx, Pxy, Pyy, Uxx, Uxy, Uyy
*_combined.tar summary.json — all kappas combined
PART V: THE PRINTING PROTOCOL
Why Printing Matters
text
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHY PRINTING IS CRITICAL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Colab runtime disconnect → all memory lost │
│ BUT: stdout is captured by browser │
│ You can copy-paste printed data into chat │
│ → This is your CHEAP AND DIRTY BACKUP │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Implementation
python
def pprint(*args, **kwargs):
"""Force flush=True on all print statements."""
kwargs.setdefault('flush', True)
print(*args, **kwargs)
What Gets Printed
text
Step 0 | H: 3.41578833e+01 | Baseline
Step 500 | H: 5.78078499e+01 | Drift: 4.3601e-03 | kappa=0.10
Step 1000 | H: 5.80602297e+01 | Drift: 8.7450e-03 | kappa=0.10
Step 1500 | H: 5.83140391e+01 | Drift: 1.3155e-02 | kappa=0.10
Step 2000 | H: 5.85692814e+01 | Drift: 1.7589e-02 | kappa=0.10
PART VI: THE SAVE FUNCTION
Standard Implementation
python
def save_test_data(test_name, data, run_uid):
"""Save test data as .json and .tar, verify file exists."""
json_file = f"{test_name}_{run_uid}.json"
tar_file = f"{test_name}_{run_uid}.tar"
# Write JSON
with open(json_file, 'w') as f:
json.dump(data, f, indent=2, default=float)
pprint(f" 📄 {json_file} saved ({os.path.getsize(json_file)} bytes)")
# Create .tar
with tarfile.open(tar_file, 'w') as tar:
tar.add(json_file, arcname=os.path.basename(json_file))
# Verify .tar exists
if os.path.exists(tar_file):
size = os.path.getsize(tar_file)
pprint(f" 📦 {tar_file} created ({size} bytes)")
else:
pprint(f" ❌ {tar_file} NOT CREATED!")
# Clean up JSON (keep only .tar)
os.remove(json_file)
return tar_file
PART VII: THE DOWNLOAD PROTOCOL
After Run Completion
python
from google.colab import files
# Download combined summary first
files.download('extended_sweep_summary_20260705_120000.tar')
# Then download individual files
files.download('kappa_0_1000_data_20260705_120000.tar')
files.download('kappa_0_1000_history_20260705_120000.tar')
# ... etc
Why This Order
Summary first → captures all metadata
Individual files → detailed data per run
No Drive backup → direct download only
PART VIII: SOP CHECKLIST
Requirement Status
Print all data to stdout ✅
flush=True on all prints ✅
Individual .tar per test ✅
Individual .tar per kappa ✅
Data + history separated ✅
No ZIP files ✅
No Google Drive ✅
No nested folders ✅
Try...finally cleanup ✅
File existence verification ✅
Download instructions printed ✅
PART IX: THE DEFENSIBLE STATEMENT
"The Series 12 data retention protocol uses a three-tier strategy: (1) all data printed to stdout with flush=True for runtime recovery, (2) individual .tar files per test and per kappa for granular data preservation, and (3) a combined summary .tar for global overview. No Google Drive, no ZIP files, no nested folders. All files are saved to the workspace root and can be downloaded individually. This protocol has survived multiple Colab runtime disconnects with zero data loss."
PART X: THE BOTTOM LINE
text
┌─────────────────────────────────────────────────────────────────────────────┐
│ DATA RETENTION SOP — SUMMARY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ DO: │
│ ✅ Print everything to stdout (flush=True) │
│ ✅ Save individual .tar per test/kappa │
│ ✅ Separate data types (data, history, summary) │
│ ✅ Verify file existence │
│ ✅ Clean up temp files │
│ │
│ DON'T: │
│ ❌ Use ZIP files │
│ ❌ Use Google Drive │
│ ❌ Use nested folders │
│ ❌ Buffer prints │
│ ❌ Assume files saved without verification │
│ │
└─────────────────────────────────────────────────────────────────────────────┘