
Full Report · GIMS Backend Refactor · the whole branch

From copy-paste FastAPI to a verifiable, layered platform

A four-day, 152-commit sprint on refactor/foundation that hardened security and storage, unified the part-of-speech model and the execution path, reorganized the entire HTTP/JSON backend out of "god files" into small role-named packages, and added a real 21 CFR Part 11 compliance core — every step proven behaviour-neutral before it was committed.

The thesis that made it safe: routes are decorator-defined and the guard baselines are keyed by path/name, not filename — so a move + import-rewrite is provably wiring-neutral. A function-level AST check, an ordered-route fingerprint, three byte-pinned baselines, a no-cloud-SDK-in-core layering guard, and — where the suite was thin — bespoke behaviour-golden harnesses backed every claim of "nothing changed."
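A function-level AST check of the kind named above could look roughly like the sketch below. It is an illustration, not the branch's actual tool: the helper names `function_dumps` and `changed_functions` are hypothetical, and it compares only function bodies, so a dropped module-level import would pass unnoticed.

```python
import ast


def function_dumps(source: str) -> dict[str, str]:
    """Map each function's dotted name to a position-free dump of its AST."""
    found: dict[str, str] = {}

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # ast.dump omits line/column attributes by default, so a
                # function that only moved produces an identical string.
                found[prefix + child.name] = ast.dump(child)
                visit(child, prefix + child.name + ".")
            elif isinstance(child, ast.ClassDef):
                visit(child, prefix + child.name + ".")
            else:
                visit(child, prefix)

    visit(ast.parse(source), "")
    return found


def changed_functions(old_source: str, new_sources: list[str]) -> list[str]:
    """Names from the old file that are missing or altered after the split."""
    old = function_dumps(old_source)
    new: dict[str, str] = {}
    for source in new_sources:
        new.update(function_dumps(source))
    return sorted(name for name in old if new.get(name) != old[name])
```

An empty result means every function in the old file reappears with an identical body somewhere in the new files.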

Branch refactor/foundation · Safety tag pre-refactor (b162595, 2026-06-23) → 888a295 · 152 commits · 533 files · 2026-06-24 → 2026-06-27

01 Snapshot

Despite enormous activity, the codebase ended up smaller and provable — the work was dominated by collapse and dedup, gated on a green baseline at every commit.

152
commits
over 4 days
533
unique files touched
86,896
cumulative churn
(sum of every commit)
−3,100
NET lines
(the tree shrank)
563
tests pass
(from ~382 early on)
8 / 8
backend phases done
(0–8 + R1–R21 + Part 11)

Where it landed

A layered backend — api/routers/ (HTTP) · core/ (logic) · nodes/ (orchestration) · modules/ (registration) · utils/ (kernel) — with a unified SQL record store behind a pluggable provider registry, one part-of-speech engine, a hardened-container execution backend, an HMAC-chained Part 11 audit/compliance core, a uniform error contract, and CI. App entry is api.app:app; the live route surface is stable at 320 OpenAPI paths / 368 routes (order-hash b361335f…).

02 Timeline: how it unfolded

The branch moved in waves: build the kernel & contracts, unify the domain model, cut storage over to SQL, harden execution, then reorganize the whole HTTP layer, collapse the duplication, and finish with compliance + the last cohesion items.

06-23
0–2

Foundation — hygiene · shared kernel · error contract

Pinned requirements.txt, consolidated utils/{logger,paths,config,atomic} as the kernel, and introduced core/errors.py AppError + a central handler so every failure renders one envelope. (The error-contract tail and CI were finished on 06-27.)

tag pre-refactor b162595
06-24
3

Cohesion core — one part-of-speech engine

Built core/words/: WordType, WordHandler, behaviors, the unified Descriptor, validation, WordRegistry, a single normalize-on-read choke point (reader.read_types tolerates list or keyed-dict on disk), and one read-only /pos surface. Editor + workbench + audit now validate through one engine.

8556c74 · db6e772 · 1d6324e · 1bc2583
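The normalize-on-read choke point in the Cohesion core entry accepts two on-disk shapes. A minimal sketch of that tolerance follows, assuming each type is a dict identified by a `name` key; the key and the function name `normalize_types` are hypothetical and not taken from reader.read_types.

```python
def normalize_types(raw: object) -> list[dict]:
    """Return a list of type dicts whether the file held a list or a keyed dict."""
    if isinstance(raw, dict):
        # Keyed-dict form: {"sample": {...}} becomes [{"name": "sample", ...}].
        items = [{**value, "name": key} for key, value in raw.items()]
    elif isinstance(raw, list):
        items = [dict(item) for item in raw]
    else:
        raise ValueError(f"unsupported on-disk shape: {type(raw).__name__}")
    for item in items:
        if "name" not in item:
            raise ValueError("every type needs a name")
    return items
```

Because every consumer reads through one function, the on-disk format can vary without any caller having to branch on it.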
06-24
4

Dispatch foundation — typed RunContext + executor registry

A typed RunContext at the GUI boundary and a one-lookup kind→executor registry, replacing inline if kind=="parser" branching.

e3b1322
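The one-lookup kind→executor registry from the Dispatch entry might be shaped like this. Only the names RunContext and the "parser" kind come from the report; the fields on `RunContext`, the `register` decorator, and the toy executor are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class RunContext:
    """Typed request context (fields are illustrative)."""
    kind: str
    payload: dict = field(default_factory=dict)


Executor = Callable[[RunContext], dict]
_EXECUTORS: dict[str, Executor] = {}


def register(kind: str) -> Callable[[Executor], Executor]:
    """Register an executor; a duplicate kind fails loudly instead of overwriting."""
    def decorator(fn: Executor) -> Executor:
        if kind in _EXECUTORS:
            raise ValueError(f"executor already registered for kind {kind!r}")
        _EXECUTORS[kind] = fn
        return fn
    return decorator


def dispatch(ctx: RunContext) -> dict:
    """Single dictionary lookup in place of if/elif branching on kind."""
    try:
        executor = _EXECUTORS[ctx.kind]
    except KeyError:
        raise LookupError(f"no executor registered for kind {ctx.kind!r}") from None
    return executor(ctx)


@register("parser")
def run_parser(ctx: RunContext) -> dict:
    return {"kind": ctx.kind, "tokens": ctx.payload.get("text", "").split()}
```

Adding a new kind then means adding one decorated function, with no edit to the dispatch path.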
06-24
5

Storage unification — JSONL/folders → one SQL instances table

Provider-neutral ports (core/storage/ports.py, zero cloud SDK) + local adapter + a unified SqlRecordStore + a JSONL→SQL migrator. A reconciliation gate proved the old SQL store and items.jsonl had diverged at the field level; owner chose SQL-authoritative. Then the live cutover on the LIMS data (md5-verified backup + tag pre-phase5-cutover): 222 rows into instances, images relocated, nouns/ folders retired, and the whole app rewired to read/write through factory.get_record_store().

8c99db9 · bcefc76 · c8dc301 → app-rewire 91a985c…2bc5183
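A field-level reconciliation gate like the one in the Storage entry can be approximated by comparing two record sets keyed by id. The record shape, the `id` key, and the report layout below are assumptions; the real gate is not shown in this report.

```python
def reconcile(sql_rows: list[dict], jsonl_rows: list[dict], key: str = "id") -> dict:
    """Report ids present on one side only and fields that differ on shared ids."""
    left = {row[key]: row for row in sql_rows}
    right = {row[key]: row for row in jsonl_rows}
    report = {
        "only_in_sql": sorted(left.keys() - right.keys()),
        "only_in_jsonl": sorted(right.keys() - left.keys()),
        "field_diffs": {},
    }
    for record_id in sorted(left.keys() & right.keys()):
        a, b = left[record_id], right[record_id]
        differing = sorted(f for f in a.keys() | b.keys() if a.get(f) != b.get(f))
        if differing:
            report["field_diffs"][record_id] = differing
    return report
```

A non-empty `field_diffs` is the situation described above: both stores hold the record but disagree on its contents, so one side has to be declared authoritative before migrating.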
06-25
6

Backends & hardening — R15 execution + R21 front door

One ExecutionBackend seam with the hardened container as the default (in-process gated behind a flag), a hardened command builder (utils/container_run.py: non-root, --cap-drop=ALL, read-only rootfs + tmpfs, no-new-privileges, pid/mem/cpu caps, --network=none), and a host-gateway artifact broker (path-containment, no-symlink, ext+magic-byte allow-list, size/count caps). Verified end-to-end on real rootless podman. Plus the R21 page-node/module factories + single registry-mounted front door.

5da3c20…32463a7
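The hardening flags listed in the Backends entry map onto standard `podman run` options. The sketch below only builds the argument list and never executes it; the function name, the limit values, the numeric user, and the tmpfs mount point are assumptions, not the contents of utils/container_run.py.

```python
def build_run_argv(
    image: str,
    command: list[str],
    *,
    user: str = "65534:65534",   # assumed unprivileged uid:gid
    tmpfs: str = "/tmp",         # assumed writable scratch mount
    pids: int = 64,              # assumed caps; tune per workload
    memory: str = "256m",
    cpus: str = "1.0",
) -> list[str]:
    """Build a podman argv with the hardening options applied."""
    return [
        "podman", "run", "--rm",
        f"--user={user}",                     # non-root
        "--cap-drop=ALL",                     # drop every Linux capability
        "--read-only",                        # read-only root filesystem
        f"--tmpfs={tmpfs}",                   # the only writable path
        "--security-opt=no-new-privileges",   # block setuid escalation
        f"--pids-limit={pids}",
        f"--memory={memory}",
        f"--cpus={cpus}",
        "--network=none",                     # no network access
        image,
        *command,
    ]
```

Returning a list rather than a shell string keeps user-supplied arguments from ever being interpreted by a shell.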
06-25
6+

Orchestration-engine hardening

Fail-fast/loud startup validation (no silent duplicate-name overwrite), guards fail-closed, the live trigger chain de-silenced + per-handler timeout (an EventDispatcher down-payment), and a shared-singleton mount dedup that collapsed 947→375 route entries with paths unchanged.

39043af…ce6a416
06-25→26
RE

Backend reorganization — passes 1–4 (the "god files")

The big structural sweep, one file at a time, each move proven verbatim. Pass 1: fixed a secretly broken guard (an incidental FastAPI 0.138 upgrade had hidden 311/337 routes), reaped dead code, moved gui/*_gui.py → api/routers/ and gui_main.py → api/app.py, split 6 god files. Pass 2: the other 6 god files + _normalize_for_psycopg×9 deduped + full prefix cleanup. Pass 3: verb-log dedup. Pass 4: the make_compliance_node factory collapsed 13 clones (−5,893 lines) + the archive_workbench split.

38a8481…e3b4aa1 · 6b3ce7b…55040ce · 552acf5 · 3527229…d7d6cee
06-26
R18

Twin-tree Descriptor collapse

Cut the live consumers onto the unified Descriptor and deleted the two legacy twin trees (12 files, −1,032 lines), proven neutral by an old-vs-new equivalence harness before deletion. The backend reorg deferred-list was now empty.

5132ff3
06-27
P11

21 CFR Part 11 compliance core

Replaced the cosmetic unkeyed checksum with a real HMAC tamper-evidence chain (over prev_checksum + chain_seq + signature fields), server-side two-component e-signatures at the append chokepoint, sealed signed exports on every compliance/audit read, a least-privilege gims_compliance_writer DB role, and trusted/validated time. P8 was live-verified on real Postgres 16 — the restricted role is denied UPDATE/DELETE/TRUNCATE by GRANT (SQLSTATE 42501). A 6-agent reconciliation audit then re-verified every phase against live code.

70b3b76…56f6215 · backup-integrity guard 8c96b44
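The HMAC tamper-evidence chain in the Part 11 entry covers prev_checksum, chain_seq, and the signature fields. A minimal sketch of that construction is below, assuming SHA-256, canonical JSON serialization, and an in-memory list as the log; key handling and the real record layout are not shown in the report.

```python
import hashlib
import hmac
import json


def _mac(key: bytes, prev_checksum: str, chain_seq: int, fields: dict) -> str:
    """Keyed MAC over the previous checksum, the sequence number and the fields."""
    message = json.dumps(
        {"prev_checksum": prev_checksum, "chain_seq": chain_seq, "fields": fields},
        sort_keys=True,
        separators=(",", ":"),
    ).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append(chain: list[dict], key: bytes, fields: dict) -> dict:
    """Append an entry whose checksum binds it to its predecessor."""
    prev = chain[-1]["checksum"] if chain else ""
    seq = len(chain)
    entry = {"chain_seq": seq, "prev_checksum": prev, "fields": fields,
             "checksum": _mac(key, prev, seq, fields)}
    chain.append(entry)
    return entry


def verify(chain: list[dict], key: bytes) -> bool:
    """Recompute every link; any edit, reorder or removal breaks the chain."""
    prev = ""
    for seq, entry in enumerate(chain):
        if entry["chain_seq"] != seq or entry["prev_checksum"] != prev:
            return False
        expected = _mac(key, prev, seq, entry["fields"])
        if not hmac.compare_digest(expected, entry["checksum"]):
            return False
        prev = entry["checksum"]
    return True
```

Unlike an unkeyed checksum, an attacker who can rewrite rows cannot recompute valid checksums without the key.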
06-27
0/2

Phase 0 + Phase 2 tail — CI, error-contract close-out

Added pyproject.toml (ruff/pyright) + a CI pipeline (pytest + ruff), converted the last 45 HTTPException sites to AppError, audited all 163 silent except blocks (156 intentional, 7 fixed), and fixed 4 latent NameError bugs in untested branches — including a 500 on every successful image upload.

fc5af9e…90baaf3
06-27
3/4/5

Final cohesion / dispatch / storage-ergonomics — backend COMPLETE

The wordtype migration + reader-bypass data-loss fix, the pluggable storage-provider registry, the ExecutionService extraction from the 426-line run_custom_tool (proven by a golden harness + a real container run), and the adjective/adverb descriptor-router collapse. See the companion session report for this day's detail.

9768602…888a295 — 8 commits

03 Cumulative churn

Two numbers answer two questions: the net diff (how the tree differs end-to-end) and the cumulative churn (total activity, counting a line re-edited across N commits N times).

Net diff (start → end)

+38,205 ins · −41,305 del

560 files · net −3,100 lines. The refactor was net-reductive: god-file splits, the −5,893 compliance-factory collapse, the −1,032 twin-tree collapse and ~770 lines of dead-code reaping removed more than was added.

Cumulative churn (all 152 commits)

+42,693 ins · −44,203 del

86,896 lines churned across the branch — the true measure of work. Roughly 572 lines of churn per commit, 533 distinct files touched.
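The headline numbers in this section follow directly from the insertion and deletion totals quoted above. A short check, using only figures from this report:

```python
# Totals quoted in this section.
net_ins, net_del = 38_205, 41_305        # end-to-end diff
churn_ins, churn_del = 42_693, 44_203    # summed over every commit
commits = 152

net_lines = net_ins - net_del            # negative means the tree shrank
total_churn = churn_ins + churn_del      # re-edited lines count once per commit
churn_per_commit = round(total_churn / commits)

print(net_lines, total_churn, churn_per_commit)
```

The gap between the two insertion totals (42,693 vs 38,205) is the work that was later rewritten or deleted within the branch.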

Commits per day

06-24: 74 · 06-25: 21 · 06-26: 39 · 06-27: 18

Highest-churn files (cumulative ins+del)

These are exactly the god files that were split and the deduped engines — high churn here reflects collapse, not feature sprawl.

| File (often since split / renamed) | churn |
|---|---|
| core/core_run_customs.py → core/run_custom/* | 3,038 |
| api/routers/runlog_workbench.py → package | 2,311 |
| api/i_o.py → api/iostore/* | 1,817 |
| api/routers/nodes_compliance.py → factory + configs | 1,764 |
| api/routers/account_roles.py | 1,523 |
| gui/archive_workbench_gui.py → api/routers/archive_workbench/* | 1,498 |
| api/routers/backup.py | 1,297 |
| core/run_custom/runner.py | 1,263 |
| core/core_audit.py → core/audit/* | 1,219 |
| api/routers/verb.py | 1,100 |

04 What got built

Layered structure

  • api/routers/ — HTTP/JSON, decorator-defined routes
  • core/ — pure logic (no cloud SDK; layering-guarded)
  • nodes/ — orchestration nodes · modules/ — registration
  • utils/ — the kernel (logger, paths, config, atomic)
  • No *_gui.py; no core_*/*_module/*_node filename prefixes
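The "no cloud SDK in core/" rule in the list above lends itself to a static import scan. A minimal sketch follows; `boto3` is named in this report, while the rest of the `CLOUD_SDKS` set and the function name are assumptions.

```python
import ast

# boto3 comes from the report; botocore is an assumed addition.
CLOUD_SDKS = {"boto3", "botocore"}


def cloud_imports(source: str) -> list[str]:
    """Return the cloud SDK top-level packages imported anywhere in the source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            roots = [node.module.split(".")[0]]
        else:
            continue
        found.update(root for root in roots if root in CLOUD_SDKS)
    return sorted(found)
```

A guard test would run this over every file under core/ and fail if any result is non-empty. Because `ast.walk` also visits function bodies, imports deferred inside a function are caught as well.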

Storage

  • One unified SQL instances table; per-noun tables / items.jsonl / nouns/ retired
  • Provider-neutral ports + a pluggable provider registry (local/aws built-in, entry-point discovery)
  • Blobs via ObjectStore; boto3 isolated to api/
  • Transactional unit-of-work (R4); locked IdService (R1)
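A transactional unit-of-work over a single instances table can be illustrated with the standard-library `sqlite3` module. The column layout (`id`, `noun`, `data`) and the helper names are assumptions; the report does not show the real schema or the SqlRecordStore API.

```python
import json
import sqlite3
from contextlib import contextmanager


def open_store() -> sqlite3.Connection:
    """In-memory database with one table for every record kind (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances (id TEXT PRIMARY KEY, noun TEXT NOT NULL, data TEXT NOT NULL)"
    )
    conn.commit()
    return conn


@contextmanager
def unit_of_work(conn: sqlite3.Connection):
    """Commit everything done inside the block, or roll all of it back."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def put(conn: sqlite3.Connection, record_id: str, noun: str, data: dict) -> None:
    conn.execute(
        "INSERT INTO instances (id, noun, data) VALUES (?, ?, ?)",
        (record_id, noun, json.dumps(data)),
    )
```

If any write inside a `with unit_of_work(conn):` block fails, none of that block's writes survive.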

Domain & execution

  • One part-of-speech engine: WordType/Descriptor/WordRegistry, twin trees deleted
  • Hardened-container execution backend (default) + artifact broker
  • ExecutionService extracted from the custom-tool monolith
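The artifact broker's checks (path containment, no symlinks, extension plus magic-byte allow-list, size cap) can be sketched as a single predicate. The allow-list entries, the size limit, and the function name are assumptions; the count cap is omitted for brevity.

```python
from pathlib import Path

# Assumed allow-list: extension -> required leading bytes.
ALLOWED_MAGIC = {".png": b"\x89PNG\r\n\x1a\n", ".pdf": b"%PDF-"}
MAX_BYTES = 5 * 1024 * 1024  # assumed per-file cap


def accept_artifact(root: Path, candidate: Path) -> bool:
    """True only if the candidate is a safe, allow-listed file under root."""
    root = root.resolve()
    if candidate.is_symlink():
        return False
    resolved = candidate.resolve()
    # Containment: the fully resolved path must still live under root.
    if not resolved.is_relative_to(root) or not resolved.is_file():
        return False
    magic = ALLOWED_MAGIC.get(resolved.suffix.lower())
    if magic is None or resolved.stat().st_size > MAX_BYTES:
        return False
    with resolved.open("rb") as handle:
        return handle.read(len(magic)) == magic
```

Checking the leading bytes as well as the extension stops a renamed file from passing on its name alone.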

Safety & contracts

  • 21 CFR Part 11: HMAC-chained audit/compliance, two-component e-sign, sealed exports, least-privilege role, trusted time
  • Uniform AppError error envelope; silent-except audit
  • CI (pytest + ruff) on every push/PR; ruff-clean tree
  • Request-scoped EventDispatcher (R7); server-side gate sign-off (R6)

05 Risk catalog & Part 11

The refactor was scoped against a risk catalog R1–R21 and a 21 CFR Part 11 parity gap P1–P10. Headline outcomes below; full file:line evidence lives in proposals/gims_project_refactor.md.

Risk catalog — R1–R21 done

  • R1 locked id service · R4 transactional record store
  • R6 server-side gate sign-off · R7 request-scoped EventDispatcher
  • R9 resilient scheduler + run-history · R10 upload hardening
  • R15 hardened-container execution + artifact broker
  • R17 atomic archive (DB+FS or rollback) · R18 twin-tree collapse · R19 audit engine

Part 11 (P1–P10): P1–6 and P8–10 done

  • P1–P3, P5 HMAC tamper-evidence chain + verifier
  • P4 server-side two-component e-sign at the append chokepoint
  • P6 auth + sealed signed exports on every read
  • P8 least-privilege DB role: live-verified on Postgres (UPDATE/DELETE/TRUNCATE rejected with SQLSTATE 42501 because the role is never granted them)
  • P9 trusted/validated time · P10 audit log HMAC-chained too
  • P7 is the one intentional open-logging deviation (documented)

06 How "nothing changed" was proven

The discipline that let 152 commits land without regressions: layered, independent checks, run after every commit.

Structural guards

  • Ordered-route fingerprint — a hash of every path::methods in registration order; byte-identical for wiring-neutral moves
  • Three byte-pinned baselines — OpenAPI paths, all-routes, route-order
  • Function-level AST check — proves a body moved verbatim across a split
  • Layering guard: core/ never imports a cloud SDK; factory imports stay lazy
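The ordered-route fingerprint in the list above is a hash over `path::methods` entries in registration order. A minimal sketch, assuming SHA-256 and newline-joined entries; the exact serialization behind the b361335f… hash is not given in this report.

```python
import hashlib
from typing import Iterable


def route_fingerprint(routes: Iterable[tuple[str, Iterable[str]]]) -> str:
    """Hash every path::methods entry, preserving registration order."""
    lines = [f"{path}::{','.join(sorted(methods))}" for path, methods in routes]
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()
```

Methods are sorted within each route so set ordering cannot change the hash, while the order of the routes themselves is kept because it decides which handler wins a match. For a FastAPI app, the pairs could be read from the `path` and `methods` attributes of each entry in `app.routes` that has them.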

Behavioural proof where the suite was thin

  • Behaviour-golden harnesses — pin endpoint responses / side effects PRE vs POST (compliance envelopes, the custom-tool runner, the descriptor routers)
  • Equivalence harnesses — old-vs-new call-for-call before deleting legacy trees
  • Live runs — real hardened-container parser on rootless podman; P8 on real Postgres
  • Lesson banked: AST proves bodies, not module-scope imports — a full-suite run catches the dropped import
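A behaviour-golden harness of the kind listed above pins outputs before a change and diffs them afterwards. The sketch below covers return values only, not side effects; the helper names and the case format are hypothetical.

```python
import json
from typing import Callable


def capture(handler: Callable[..., object], cases: dict[str, dict]) -> dict[str, str]:
    """Record a canonical JSON snapshot of the handler's output for each named case."""
    return {
        name: json.dumps(handler(**kwargs), sort_keys=True)
        for name, kwargs in cases.items()
    }


def golden_diff(pre: dict[str, str], post: dict[str, str]) -> list[str]:
    """Names of cases whose snapshot changed, appeared or disappeared."""
    return sorted(name for name in pre.keys() | post.keys() if pre.get(name) != post.get(name))
```

The PRE snapshot is captured on the old code and stored; after the refactor the same cases are replayed and an empty diff is the evidence that behaviour held.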

Test baseline grew the whole way

~382 → 406 → 471 → 484 → 534 → 563 passing, with a stable set of pre-existing environment/isolation failures and zero new regressions introduced by the refactor.

07 Token / effort estimate (whole refactor)

Not measured — extrapolated. There is no token-accounting tool exposed inside a session, so these are methodology-based estimates with deliberately wide bands. The usage dashboard is the only authoritative source for cumulative spend; /cost gives the per-session truth.

~13–18
work sessions
(several with multi-agent workflows)
~1.5–3M
output tokens
(everything actually written)
~100–300M
total tokens processed
(rough order-of-magnitude)

Why the "processed" total is so large

  • Per-turn context re-reads. Every turn re-sends the growing conversation (system prompt + all prior messages + tool results). ~13–18 sessions each processing on the order of a few million tokens (mostly cache) already lands around ~75–160M.
  • Multi-agent fan-outs. The refactor leaned on workflows & audits — a 6-agent reconciliation audit, an 8-agent engine scan, read-only split-map Workflows, per-file split subagents. Each subagent carries its own context, adding tens of millions more.
  • It's mostly cache. The large majority of "processed" is prompt-cache hits, billed at a fraction of fresh input — so dollar cost is far below what the processed count implies.

| Scope | output (written) | total processed | basis |
|---|---|---|---|
| A single focused session (e.g. the 06-27 backend close-out, 8 commits) | ~55–80k | ~4–8M | measured churn + ~90 turns × growing context |
| Whole refactor (152 commits) | ~1.5–3M | ~100–300M | ~13–18 sessions scaled + subagent/workflow fan-out |

Bands could be off by 2–3×. Treat the whole-refactor figure as "order tens-to-hundreds of millions processed, low-single-digit millions written" — and confirm against the usage dashboard.

08 What's left

The backend refactor is complete. The remaining work is front-end only.

Backend status: complete

Phases 0–8 (backend), the risk catalog R1–R21, and the 21 CFR Part 11 track are done. handoff.md at the repo root is the canonical, up-to-date record.