Full Report · GIMS Backend Refactor · the whole branch
A four-day, 152-commit sprint on refactor/foundation that hardened security and
storage, unified the part-of-speech model and the execution path, reorganized the entire HTTP/JSON backend
out of "god files" into small role-named packages, and added a real 21 CFR Part 11 compliance core —
every step proven behaviour-neutral before it was committed.
Despite enormous activity, the codebase ended up smaller and provable — the work was dominated by collapse and dedup, gated on a green baseline at every commit.
A layered backend — api/routers/ (HTTP) · core/ (logic) ·
nodes/ (orchestration) · modules/ (registration) · utils/ (kernel) —
with a unified SQL record store behind a pluggable provider registry, one part-of-speech engine, a
hardened-container execution backend, an HMAC-chained Part 11 audit/compliance core, a uniform error
contract, and CI. App entry is api.app:app; the live route surface is stable at 320 OpenAPI
paths / 368 routes (order-hash b361335f…).
The branch moved in waves: build the kernel & contracts, unify the domain model, cut storage over to SQL, harden execution, then reorganize the whole HTTP layer, collapse the duplication, and finish with compliance + the last cohesion items.
Pinned requirements.txt, consolidated utils/{logger,paths,config,atomic} as
the kernel, and introduced core/errors.py AppError + a central handler so every failure
renders one envelope. (The error-contract tail and CI were finished on 06-27.)
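The single-envelope idea can be sketched without any web framework. This is a minimal illustration, not the contents of core/errors.py: the field names (`code`, `message`, `status`, `details`), the envelope shape, and the `to_envelope`/`handle` helpers are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class AppError(Exception):
    """Application error carrying everything the envelope needs (fields are hypothetical)."""

    def __init__(self, code: str, message: str, status: int = 400,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.code = code
        self.message = message
        self.status = status
        self.details = details or {}


def to_envelope(exc: Exception) -> Tuple[int, Dict[str, Any]]:
    """Render any exception as (HTTP status, envelope body)."""
    if isinstance(exc, AppError):
        return exc.status, {"error": {"code": exc.code,
                                      "message": exc.message,
                                      "details": exc.details}}
    # Unknown failures get the same shape, without leaking internals.
    return 500, {"error": {"code": "internal_error",
                           "message": "Internal server error",
                           "details": {}}}


def handle(call: Callable[[], Any]) -> Tuple[int, Any]:
    """Central handler: every failure path goes through to_envelope."""
    try:
        return 200, call()
    except Exception as exc:
        return to_envelope(exc)
```

In a real FastAPI app the same rendering function would typically be wired in as an exception handler, so routers raise `AppError` and never build error bodies themselves.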
Built core/words/: WordType, WordHandler, behaviors, the unified
Descriptor, validation, WordRegistry, a single normalize-on-read choke point
(reader.read_types tolerates list or keyed-dict on disk), and one read-only
/pos surface. Editor + workbench + audit now validate through one engine.
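A normalize-on-read choke point of this kind might look like the sketch below. Only the name `read_types` and the "list or keyed-dict" tolerance come from the report; the `name` field, the JSON encoding, and the `normalize_types` helper are assumptions for illustration.

```python
import json
from typing import Any, Dict, List


def normalize_types(raw: Any) -> List[Dict[str, Any]]:
    """Accept either on-disk shape and return one canonical list of dicts.

    list form:       [{"name": "noun", ...}, ...]
    keyed-dict form: {"noun": {...}, "verb": {...}}
    """
    if isinstance(raw, list):
        items = [dict(item) for item in raw]
    elif isinstance(raw, dict):
        # The key becomes the name unless the entry already carries one.
        items = [{"name": key, **value} for key, value in raw.items()]
    else:
        raise ValueError(f"unsupported word-type payload: {type(raw).__name__}")
    for item in items:
        if not item.get("name"):
            raise ValueError("word type without a name")
    return items


def read_types(text: str) -> List[Dict[str, Any]]:
    """Single entry point: callers never see the raw on-disk shape."""
    return normalize_types(json.loads(text))
```

Because every consumer goes through one reader, a later change of the on-disk format only has to be handled in this one place.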
A typed RunContext at the GUI boundary and a one-lookup kind→executor registry, replacing
inline if kind=="parser" branching.
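The one-lookup registry can be illustrated as follows. `RunContext` and the "parser" kind appear in the report; the fields on `RunContext`, the `register` decorator, and the `execute` function are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class RunContext:
    """Typed request built once at the boundary (fields are assumed)."""
    kind: str
    payload: Dict[str, Any] = field(default_factory=dict)


Executor = Callable[[RunContext], Any]
_EXECUTORS: Dict[str, Executor] = {}


def register(kind: str) -> Callable[[Executor], Executor]:
    """Decorator that binds one executor to one kind, failing loudly on duplicates."""
    def decorator(fn: Executor) -> Executor:
        if kind in _EXECUTORS:
            raise ValueError(f"duplicate executor for kind {kind!r}")
        _EXECUTORS[kind] = fn
        return fn
    return decorator


@register("parser")
def run_parser(ctx: RunContext) -> Any:
    # Toy executor: split the text payload into tokens.
    return {"parsed": ctx.payload.get("text", "").split()}


def execute(ctx: RunContext) -> Any:
    """One dictionary lookup replaces the if/elif chain on kind."""
    try:
        executor = _EXECUTORS[ctx.kind]
    except KeyError:
        raise ValueError(f"no executor registered for kind {ctx.kind!r}") from None
    return executor(ctx)
```

Adding a new kind then means registering one function; the dispatch code itself never changes.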
instances table
Provider-neutral ports (core/storage/ports.py, zero cloud SDK) + local adapter + a unified
SqlRecordStore + a JSONL→SQL migrator. A reconciliation gate proved the old SQL store
and items.jsonl had diverged at the field level; owner chose SQL-authoritative. Then the
live cutover on the LIMS data (md5-verified backup + tag pre-phase5-cutover): 222
rows into instances, images relocated, nouns/ folders retired, and the whole
app rewired to read/write through factory.get_record_store().
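A field-level reconciliation gate of the kind described above could be sketched like this. It is an illustration only: the `id` key, the record shape, and the report layout are assumptions, and the real gate compared the SQL store against items.jsonl rather than in-memory lists.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def reconcile(jsonl_rows: Iterable[Record], sql_rows: Iterable[Record],
              key: str = "id") -> Dict[str, List[Any]]:
    """Compare two stores record by record and field by field."""
    left = {row[key]: row for row in jsonl_rows}
    right = {row[key]: row for row in sql_rows}
    report: Dict[str, List[Any]] = {
        "only_in_jsonl": sorted(left.keys() - right.keys()),
        "only_in_sql": sorted(right.keys() - left.keys()),
        "field_diffs": [],
    }
    for rid in sorted(left.keys() & right.keys()):
        # Union of field names so a field missing on one side counts as a diff.
        for name in sorted(left[rid].keys() | right[rid].keys()):
            a, b = left[rid].get(name), right[rid].get(name)
            if a != b:
                report["field_diffs"].append((rid, name, a, b))
    return report
```

A cutover would only proceed once the report is either empty or explicitly resolved in favour of one side, which is the decision recorded above as SQL-authoritative.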
One ExecutionBackend seam with the hardened container as the default (in-process
gated behind a flag), a hardened command builder (utils/container_run.py: non-root,
--cap-drop=ALL, read-only rootfs + tmpfs, no-new-privileges, pid/mem/cpu caps,
--network=none), and a host-gateway artifact broker (path-containment, no-symlink,
ext+magic-byte allow-list, size/count caps). Verified end-to-end on real rootless podman. Plus the R21
page-node/module factories + single registry-mounted front door.
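The hardening flags listed above map onto standard `podman run` options. The builder below is a sketch under stated assumptions, not the contents of utils/container_run.py: the function name, the default UID/GID, the tmpfs options, and the numeric limits are all illustrative values.

```python
from typing import List, Sequence


def build_run_argv(image: str, command: Sequence[str], *,
                   runtime: str = "podman",
                   uid_gid: str = "65534:65534",   # assumed unprivileged user
                   pids: int = 128,                 # assumed caps
                   memory: str = "512m",
                   cpus: str = "1.0") -> List[str]:
    """Return an argv list (never a shell string) for a locked-down run."""
    return [
        runtime, "run", "--rm",
        f"--user={uid_gid}",                        # non-root
        "--cap-drop=ALL",                           # no Linux capabilities
        "--read-only",                              # read-only root filesystem
        "--tmpfs=/tmp:rw,noexec,nosuid,size=64m",   # scratch space only
        "--security-opt=no-new-privileges",
        f"--pids-limit={pids}",
        f"--memory={memory}",
        f"--cpus={cpus}",
        "--network=none",                           # no network access
        image, *command,
    ]
```

Returning a list keeps the call safe to pass to `subprocess.run(argv)` without shell interpolation, so user-supplied command parts cannot inject extra flags into a shell line.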
Fail-fast/loud startup validation (no silent duplicate-name overwrite), guards fail-closed, the live
trigger chain de-silenced + per-handler timeout (an EventDispatcher down-payment), and a
shared-singleton mount dedup that collapsed 947→375 route entries with paths unchanged.
The big structural sweep, one file at a time, each move proven verbatim. Pass 1: fixed a secretly
broken guard (an incidental FastAPI 0.138 upgrade had hidden 311/337 routes), reaped dead code, moved
gui/*_gui.py→api/routers/ and gui_main.py→api/app.py,
split 6 god files. Pass 2: the other 6 god files + _normalize_for_psycopg×9 deduped +
full prefix cleanup. Pass 3: verb-log dedup. Pass 4: the make_compliance_node
factory collapsed 13 clones (−5,893 lines) + the archive_workbench split.
Cut the live consumers onto the unified Descriptor and deleted the two legacy twin trees
(12 files, −1,032 lines), proven neutral by an old-vs-new equivalence harness before deletion.
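An old-vs-new equivalence harness can be as small as the sketch below. The helper names are hypothetical; the idea is simply to feed both implementations the same inputs and treat any difference in result or raised exception type as a blocker for deletion.

```python
from typing import Any, Callable, Iterable, List, Tuple

Outcome = Tuple[str, Any]


def _outcome(fn: Callable[[Any], Any], arg: Any) -> Outcome:
    """Capture either the return value or the exception type."""
    try:
        return ("ok", fn(arg))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def find_divergences(old: Callable[[Any], Any], new: Callable[[Any], Any],
                     inputs: Iterable[Any]) -> List[Tuple[Any, Outcome, Outcome]]:
    """Return every input on which the two implementations disagree."""
    diffs = []
    for arg in inputs:
        a, b = _outcome(old, arg), _outcome(new, arg)
        if a != b:
            diffs.append((arg, a, b))
    return diffs
```

An empty result over a representative input corpus is the evidence that the legacy tree can be removed; comparing exception types as well as values keeps error behaviour from drifting unnoticed.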
The backend reorg deferred-list was now empty.
Replaced the cosmetic unkeyed checksum with a real HMAC tamper-evidence chain (over
prev_checksum + chain_seq + signature fields), server-side two-component
e-signatures at the append chokepoint, sealed signed exports on every compliance/audit read, a
least-privilege gims_compliance_writer DB role, and trusted/validated time. P8 was
live-verified on real Postgres 16 — the restricted role is denied UPDATE/DELETE/TRUNCATE by GRANT
(SQLSTATE 42501). A 6-agent reconciliation audit then re-verified every phase against live code.
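The difference between an unkeyed checksum and an HMAC chain can be shown in a few lines. This is a minimal sketch, not the audit core itself: the entry layout, the canonical-JSON encoding, the genesis value, and the function names are assumptions; only the chained inputs (prev_checksum, chain_seq, and the signature fields) come from the report.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder for the first entry's prev_checksum


def _mac(key: bytes, prev_checksum: str, chain_seq: int, fields: Dict[str, Any]) -> str:
    # Canonical JSON so the same logical record always yields the same bytes.
    body = json.dumps(
        {"prev_checksum": prev_checksum, "chain_seq": chain_seq, "fields": fields},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def append(chain: List[Dict[str, Any]], key: bytes, fields: Dict[str, Any]) -> Dict[str, Any]:
    """Append one entry whose checksum covers the previous entry's checksum."""
    prev = chain[-1]["checksum"] if chain else GENESIS
    seq = len(chain) + 1
    entry = {"chain_seq": seq, "prev_checksum": prev, "fields": dict(fields),
             "checksum": _mac(key, prev, seq, fields)}
    chain.append(entry)
    return entry


def verify(chain: List[Dict[str, Any]], key: bytes) -> bool:
    """Recompute every link; any edit, reorder, or deletion breaks the chain."""
    prev = GENESIS
    for expected_seq, entry in enumerate(chain, start=1):
        if entry["chain_seq"] != expected_seq or entry["prev_checksum"] != prev:
            return False
        want = _mac(key, prev, expected_seq, entry["fields"])
        if not hmac.compare_digest(want, entry["checksum"]):
            return False
        prev = entry["checksum"]
    return True
```

Without the key an attacker cannot recompute valid checksums after editing a record, which is what an unkeyed hash fails to guarantee.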
Added pyproject.toml (ruff/pyright) + a CI pipeline (pytest + ruff), converted the last 45
HTTPException sites to AppError, audited all 163 silent except
blocks (156 intentional, 7 fixed), and fixed 4 latent NameError bugs in untested branches —
including a 500 on every successful image upload.
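An audit of silent except blocks can be bootstrapped with the standard `ast` module. The sketch below is one possible approach, not the tool used on the branch: it assumes "silent" means a handler body consisting only of `pass` or `...`, and the function names are hypothetical.

```python
import ast
from typing import List, Tuple


def _is_noop(stmt: ast.stmt) -> bool:
    """True for `pass` or a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and stmt.value.value is Ellipsis)


def find_silent_excepts(source: str, filename: str = "<source>") -> List[Tuple[int, str]]:
    """Return (line number, caught type) for each handler that swallows the error."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if all(_is_noop(stmt) for stmt in node.body):
            caught = ast.unparse(node.type) if node.type is not None else "<bare>"
            hits.append((node.lineno, caught))
    return sorted(hits)
```

The output is only a candidate list; each hit still needs a human decision on whether the silence is intentional, which matches the 156 intentional / 7 fixed split reported above.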
The wordtype migration + reader-bypass data-loss fix, the pluggable storage-provider registry, the
ExecutionService extraction from the 426-line run_custom_tool (proven by a
golden harness + a real container run), and the adjective/adverb descriptor-router collapse. See the
companion session report for this day's detail.
Two numbers answer two questions: the net diff (how the tree differs end-to-end) and the cumulative churn (total activity, counting a line re-edited across N commits N times).
560 files · net −3,100 lines. The refactor was net-reductive: god-file splits, the −5,893 compliance-factory collapse, the −1,032 twin-tree collapse and ~770 lines of dead-code reaping removed more than was added.
86,896 lines churned across the branch — the true measure of work. Roughly 572 lines of churn per commit, 533 distinct files touched.
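The two measures can diverge widely, as a small worked example shows. The sketch assumes churn is counted as lines added plus lines deleted per commit, summed over commits; the function name and the sample commits are illustrative.

```python
from typing import Iterable, Tuple


def net_and_churn(commits: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Each item is (lines_added, lines_deleted) for one commit."""
    net = churn = 0
    for added, deleted in commits:
        net += added - deleted      # how the tree size changed end-to-end
        churn += added + deleted    # total editing activity
    return net, churn


# A 10-line function is added, fully rewritten, then deleted again.
history = [(10, 0), (10, 10), (0, 10)]
print(net_and_churn(history))       # net 0, churn 40

# The per-commit average quoted above: 86,896 churned lines over 152 commits.
print(round(86_896 / 152))          # 572
```

This is why a branch can show a modest negative net diff while still representing a large amount of work.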
These are exactly the god files that were split and the deduped engines — high churn here reflects collapse, not feature sprawl.
| File (often since split / renamed) | churn |
|---|---|
| core/core_run_customs.py → core/run_custom/* | 3,038 |
| api/routers/runlog_workbench.py → package | 2,311 |
| api/i_o.py → api/iostore/* | 1,817 |
| api/routers/nodes_compliance.py → factory + configs | 1,764 |
| api/routers/account_roles.py | 1,523 |
| gui/archive_workbench_gui.py → api/routers/archive_workbench/* | 1,498 |
| api/routers/backup.py | 1,297 |
| core/run_custom/runner.py | 1,263 |
| core/core_audit.py → core/audit/* | 1,219 |
| api/routers/verb.py | 1,100 |
- api/routers/: HTTP/JSON, decorator-defined routes
- core/: pure logic (no cloud SDK; layering-guarded)
- nodes/: orchestration nodes
- modules/: registration
- utils/: the kernel (logger, paths, config, atomic)

- *_gui.py; no core_*/*_module/*_node filename prefixes
- instances table; per-noun tables / items.jsonl / nouns/ retired
- ObjectStore; boto3 isolated to api/IdService (R1)
- WordType/Descriptor/WordRegistry, twin trees deleted
- ExecutionService extracted from the custom-tool monolith
- AppError error envelope; silent-except audit
- ruff-clean tree
- EventDispatcher (R7); server-side gate sign-off (R6)

The refactor was scoped against a risk catalog R1–R21 and a 21 CFR Part 11 parity gap P1–P10. Headline outcomes below; full file:line evidence lives in proposals/gims_project_refactor.md.
The discipline that let 152 commits land without regressions: layered, independent checks, run after every commit.
- path::methods in registration order; byte-identical for wiring-neutral moves
- core/ never imports a cloud SDK; factory imports stay lazy
- import
- ~382 → 406 → 471 → 484 → 534 → 563 passing, with a stable set of pre-existing environment/isolation failures and zero new regressions introduced by the refactor.
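The route-order check (path::methods in registration order) lends itself to a one-function fingerprint. The sketch below is an assumption about how such a hash could be computed, not the branch's actual implementation: SHA-256, the newline separator, and the sorting of methods within a route are illustrative choices.

```python
import hashlib
from typing import Iterable, Sequence, Tuple


def route_order_hash(routes: Iterable[Tuple[str, Sequence[str]]]) -> str:
    """Fingerprint the route table as 'path::METHODS' lines in registration order."""
    # Methods are sorted within a route so only registration order matters.
    lines = [f"{path}::{','.join(sorted(methods))}" for path, methods in routes]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()
```

Comparing the hash before and after a file move gives a cheap proof that the wiring is unchanged: a move that only relocates code leaves the digest byte-identical, while any added, removed, or reordered route changes it.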
Not measured — extrapolated. There is no token-accounting tool exposed inside a session, so
these are methodology-based estimates with deliberately wide bands. The usage dashboard is the only
authoritative source for cumulative spend; /cost gives the per-session truth.
| Scope | output (written) | total processed | basis |
|---|---|---|---|
| A single focused session (e.g. the 06-27 backend close-out, 8 commits) | ~55–80k | ~4–8M | measured churn + ~90 turns × growing context |
| Whole refactor (152 commits) | ~1.5–3M | ~100–300M | ~13–18 sessions scaled + subagent/workflow fan-out |
Bands could be off by 2–3×. Treat the whole-refactor figure as "order tens-to-hundreds of millions processed, low-single-digit millions written" — and confirm against the usage dashboard.
The backend refactor is complete. The remaining work is front-end only.
Phases 0–8 (backend), the risk catalog R1–R21, and the 21 CFR Part 11 track are done.
handoff.md at the repo root is the canonical, up-to-date record.