Kimera-SWM observational surface: what Ophamin sees, what it misses

Status: strategic reframing draft, 2026-05-15. Read before committing to Ophamin v0.2 scope. Written after a re-read of Kimera-SWM's CLAUDE.md + EMPIRICAL_VALIDATION.md + KIMERA_STATE.md and a code-level walk of kimera_swm/infrastructure/ (68 subpackages) + kimera_swm/interfaces/ (5 protocols) + kimera_swm/domain/autonomous/a2a/.

The honest finding: Ophamin v0.1 covers ~10–15% of Kimera's actual observable surface. The cognitive cycle is one stratum among many. The reframe below names what's missing and proposes how to extend the six-wheel architecture without burning what works.

1. What Ophamin v0.1 actually observes

The KimeraAdapter in seeing/substrate/kimera_adapter.py targets 11 cognitive primitives via direct Python entry points:

entity (full Takwin), pentecost, ouroboros, rosetta, arachne,
walker, gwf, piovra, astrolabe, atlas, spde

Six wheels orbit this single observation surface:

seeing/ — discovers field schema, watches for HEAD changes
measuring/ — runs pre-registered scenarios across three tiers
comparing/ — drift over signed proof records
instrumenting/ — psutil per-cycle resource profile
auditing/ — ruff + bandit + mypy + vulture + radon + pip-audit on source
reporting/ — HTML / Markdown / LaTeX academic output

The shape is: one cognitive cycle in, one signed CycleResult out, six wheels look at the resulting trajectory.

This is correct for what Ophamin was first scoped to do (verify Kimera's cognitive claims). It is insufficient for what Kimera actually is.

2. What Kimera-SWM actually exposes: the verified inventory

Each row below is grep-verified against the working tree on 2026-05-15.

2.1 Persistence: six PostgreSQL repositories + ArangoDB + Redis + multi-level cache

kimera_swm/infrastructure/database/
  postgres_cognitive_repository.py
  postgres_contradiction_repository.py
  postgres_decision_repository.py
  postgres_geoid_repository.py
  postgres_insight_repository.py
  postgres_learning_repository.py
  arangodb_manager.py
  async_arango_bridge.py
  redis_manager.py
  distributed_transaction_coordinator.py
  multi_level_cache_manager.py
  enhanced_query_cache.py
  query_optimization_engine.py
  schema/ + migrations/ + insight_schema.sql
  unified_database_manager.py
  production_wiring.py

Plus infrastructure/persistence/ with vault_repository, echoform_repository, ecoform_repository, zetetic_repository, universal_database_manager, postgresql_chaos_repository, in_memory_chaos_repository.

Plus infrastructure/vault/ with vault_auditor, vault_optimizer, vault_router, vault_sync_manager.

What Ophamin observes: SCAR count via entity cycle result. Blind to: per-repository query latency, transaction conflict rate, ArangoDB graph traversal cost, Redis hit/miss ratios, cache hierarchy behavior, schema migration state, vault sync drift.

2.2 Five protocol interfaces: REST × 47 routers, GraphQL, MCP × 10 tools, CLI, WebSocket

kimera_swm/interfaces/
  rest/app.py + 24 controllers/
  graphql/  app.py + federation/ + resolvers/ + schema/ + services/
  cli/      app.py + commands/ + batch.py + interactive.py + plugins.py
  mcp/      server.py + 10 tool files + resources/ + prompts/ + transports/
  websocket/ + websockets/
  auth/ + middleware/ + compliance/ + monitoring/

Plus kimera_swm/api/routers/ with 47 router modules (auth, autonomous, cognitive, cognitive_intelligence, chaos, causality, computation, event, infrastructure, insight, integration_communication, monitoring, query, quality, rosetta, sat, security, security_protection, specialized, system, vault, …) and api/auth.py + api/graphql.py + api/security_dashboard.py.

MCP tool surface includes a2a_tools.py, autonomous_tools.py, cognitive_tools.py, geoid_tools.py, linguistic_tools.py, mathematical_tools.py, boundary_tools.py, system_tools.py, advanced_tools.py.

What Ophamin observes: nothing at this layer. Blind to: every API entry point, every MCP tool call, every WebSocket session, every GraphQL query, every CLI invocation. Auth flows, rate limits, schema-mismatch refusals — completely invisible.

2.3 A2A protocol: agent registry, capability, trust, rate-limit, HMAC

kimera_swm/domain/autonomous/a2a/
  agent_registry.py
  capability_directory.py
  conversation_manager.py
  policy_gatekeeper.py     (HMAC + trust scores + rate limit + TTL)
  task_orchestrator.py
  agent_models.py

Plus an MCP bridge at interfaces/mcp/tools/a2a_tools.py exposing the A2A primitives as MCP tools.

What Ophamin observes: nothing. Blind to: agent registration/deregistration, capability-grant flow, trust score evolution, rate-limit hits, policy violations, message latency between agents, HMAC verification failures.

2.4 Network transports: seven adapters wrapping `ArchipelPeerRouter`

kimera_swm/domain/piovra/transports/
  async_queue.py
  tcp.py
  grpc.py
  websocket.py
  kafka.py
  rabbitmq.py
  nats.py
  network_emulator.py

What Ophamin observes: nothing. Blind to: PrimePacket wire bytes (v2 compact), ed25519 signature verification rate, dead-letter accumulation, snapshot-mismatch refusals, per-transport throughput, partition rate.

2.5 Layer 4 reconciliation: three CRDTs + RIBLT + Bloom

kimera_swm/infrastructure/reconciliation/
  g_set.py
  scar_dag.py
  echoform_chain.py
  rateless_iblt.py        (η ≈ 1.4 post-Phase-14 fix)
  bloom_preflight.py
  offline_reconnect.py
  strict_mode.py

What Ophamin observes: nothing. Blind to: CRDT convergence time, RIBLT decode success rate, Bloom filter FP rate, gap-fill bandwidth, snapshot ID divergence.

2.6 Message queue + event store + CQRS

kimera_swm/infrastructure/message_queue/
  kafka_adapter.py
  rabbitmq_adapter.py
  rabbitmq_provider.py
  message_queue_manager.py
  message_queue_provider.py
  cqrs_manager.py
  event_store.py
  async_scar_writer.py + refined_async_scar_writer.py

What Ophamin observes: nothing. Blind to: event store growth, CQRS command/query latency, async SCAR write backlog, message queue depth.

2.7 Service mesh + API gateway

kimera_swm/infrastructure/service_mesh/
  istio_service_mesh_manager.py
  envoy_proxy.py
  service_discovery.py
  service_mesh_config.py

kimera_swm/infrastructure/api_gateway/
  auth_manager.py
  gateway_config.py

What Ophamin observes: nothing. Blind to: virtual-service rollout state, Envoy circuit-breaker trips, service-discovery TTL, gateway auth cache hits.

2.8 Already-existing monitoring stack: Prometheus + Grafana + Alertmanager

This is the load-bearing point. Kimera already has a 38-file monitoring stack:

kimera_swm/infrastructure/monitoring/
  prometheus_config.yml       ← exporter config
  prometheus_exporter.py
  prometheus_metrics_collector.py
  grafana_dashboard.json
  grafana_dashboard_manager.py
  grafana_dashboard_sync_service.py
  alertmanager.yml
  alert_rules.yml
  kimera_alerts.yml
  alert_service.py
  alert_channels.py             (under observability/)
  distributed_tracer.py
  structured_logger.py
  centralized_logging_system.py
  consistency_monitor_impl.py
  database_health_monitor.py
  system_homeostasis_monitor.py
  system_monitor.py + system_health_monitor.py
  performance_monitor.py
  metrics_collector.py
  observability_manager.py
  comprehensive_monitoring_manager.py
  comprehensive_monitoring_endpoints.py
  monitoring_system.py + monitoring_service.py + monitoring.py
  docker-compose.monitoring.yml
  dashboard.py

Plus a parallel kimera_swm/infrastructure/observability/ (8 files, including alerts, alert_channels, health_monitor, kccl_tracer, metrics_dashboard, tracing).

Implication: Ophamin shouldn't reinvent any of this — it should consume these existing telemetry streams as one of its observation strata. The metrics already exist; Ophamin's value-add is the proof record discipline + cross-stratum correlation + falsifiable claims, not re-implementing scrape endpoints.

2.9 Cronos timing + KCCL: 25-file temporal infrastructure

kimera_swm/infrastructure/temporal/
  cronos/
    cronos_atomic_clock.py (6-layer, 51.4 µs/tick)
    cronos_sync.py (ReversePTP + PI controller)
    cronos_metrics.py
    cronos_integration_bridge.py
    dtc_flywheel.py
    global_phase_filter.py
    system_clock_reference.py
    thorium_core.py
    zeta_standard.py
  kccl_system.py
  oscillator_registry.py
  oscillators.py
  spde_engine.py + 4 variants (optimized, prime_wave, partitioned, gpu_accelerated)
  rhythm_generator.py
  cross_frequency_coupling.py
  prime_wave_cip_coordinator.py
  reservoir_computing.py
  temporal_initializer.py + temporal_learning.py
  watchdog.py + time_provider.py

What Ophamin observes: total cycle_seconds per Takwin cycle. Blind to: per-phase KCCL timing, Cronos drift Allan deviation, oscillator phase coherence, SPDE engine selection at runtime, watchdog trips.

2.10 Quantum + encoder + frozen-on-fall + memory + security

Cluster	Files	Ophamin sees
`infrastructure/quantum/`	5 (geoid_quantum_processor, qinterpreter_service, quantum_error_correction, quantum_integration_service, quantum_reinforcement_learning)	nothing
`infrastructure/encoder_snapshot/`	8 (builder, loader, manifest, snapshot, snapshot_encoder, smoke_test, verifier)	nothing
`infrastructure/frozen_on_fall/`	6 (builder, manifest, state, takwin_integration, deployment_policy)	nothing
`infrastructure/memory/`	14 (memory_pool_manager, gc_optimizer, leak_detector, holographic_manifold, maxwell_demon_governor, …)	nothing
`infrastructure/security/` + `domain/security/`	17 + 30 = 47 (ed25519, rate_limiter, secret_rotation, manipulation_detector, gyroscopic_water_fortress, …)	GWF threat block rate via `entity` cycle

3. The Takwin entry point is itself underused

Even on the cognitive surface Ophamin already targets, the adapter only extracts a handful of fields from OrchestratorResult. Per CLAUDE.md, OrchestratorResult exposes 638 fields per cycle as of 2026-05-04 (was 542 at the 2026-04-20 audit; grows every Phase). Ophamin's CycleResult.raw carries cycle_seconds, walker_halt_mode, phi_value, gwf_blocked, manipulation_detected, dissonance_events_count, concepts_count, prime_chain, and a few others — call it ~15 fields.

That's <3% of what one cycle emits. The other 620+ fields are discarded at the adapter boundary.

4. The proposed reframing: Ophamin v0.2 as a multi-stratum observatory

The six-wheel architecture stays. What changes is what each wheel can look at.

4.1 Add a fourth axis to the substrate model: stratum

Today the adapter takes a target string ("entity", "walker", "rosetta", …). Add an orthogonal stratum dimension:

Stratum	Examples	Today	Proposed
`cognitive`	walker, rosetta, gwf, entity	✓	✓
`interface`	REST routers, GraphQL, MCP tools, CLI	✗	new
`transport`	TCP, WebSocket, Kafka, NATS pairs	✗	new
`persistence`	Postgres repos, ArangoDB, Redis, vault	✗	new
`reconciliation`	G-Set, SCAR-DAG, Echoform-chain CRDTs	✗	new
`temporal`	Cronos atomic clock, KCCL phases, oscillators	✗	new
`security`	A2A trust, ed25519 verify, GWF anchors, rate-limit	partial	full
`telemetry`	Prometheus scrape, structured logs, Grafana panels	✗	new (consumer)
`lifecycle`	encoder snapshot, frozen-on-fall, vessel state	✗	new

A scenario picks (stratum, target). Existing scenarios stay on (cognitive, *). New scenarios reach into other strata.
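
A minimal sketch of how a scenario could declare its (stratum, target) pair. The `Stratum` enum, the `ObservationTarget` dataclass, and the `stratum:target` spec syntax are all assumptions made for illustration; none of them exist in Ophamin today. A bare target string defaults to the cognitive stratum, so existing scenarios would keep working unchanged.

```python
from dataclasses import dataclass
from enum import Enum


class Stratum(str, Enum):
    """The nine strata from the table above (illustrative names)."""
    COGNITIVE = "cognitive"
    INTERFACE = "interface"
    TRANSPORT = "transport"
    PERSISTENCE = "persistence"
    RECONCILIATION = "reconciliation"
    TEMPORAL = "temporal"
    SECURITY = "security"
    TELEMETRY = "telemetry"
    LIFECYCLE = "lifecycle"


@dataclass(frozen=True)
class ObservationTarget:
    """Hypothetical (stratum, target) pair that a scenario declares."""
    target: str
    stratum: Stratum = Stratum.COGNITIVE  # default keeps v0.1 scenarios valid


def parse_target(spec: str) -> ObservationTarget:
    """Accept 'walker' (legacy form) or 'transport:tcp' (assumed new form)."""
    stratum_name, sep, target = spec.partition(":")
    if not sep:
        target, stratum = spec, Stratum.COGNITIVE
    else:
        stratum = Stratum(stratum_name)  # ValueError on an unknown stratum
    if not target:
        raise ValueError(f"empty target in spec {spec!r}")
    return ObservationTarget(target=target, stratum=stratum)
```

Failing loudly on an unknown stratum or an empty target keeps this consistent with the no-silent-fallbacks stance listed in section 8.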

4.2 Map new strata to existing wheels

Stratum	seeing/	measuring/	comparing/	instrumenting/	auditing/	reporting/
interface	enumerate routers / tools / GQL fields	blackbox-probe scenarios, schema-evolution claims	drift in router count / signature	per-request resource profile via wrk/oha	ruff/bandit on router code	latency histograms
transport	enumerate transport instances	partition-recovery scenarios, RIBLT bandwidth claims	wire-byte drift across commits	per-transport throughput probe	scan for unsigned packet acceptance	bandwidth/latency charts
persistence	schema introspection across all 6 PG repos + Arango + Redis	DB consistency-after-chaos scenarios	schema drift across commits	per-query timing	sqlfluff / dataset migration audit	DB health dashboard
reconciliation	enumerate CRDT instances	convergence-time scenarios	RIBLT η drift across commits	per-merge cost profile	scan for non-idempotent paths	convergence proofs
temporal	enumerate oscillators + KCCL phases	Cronos Allan-deviation scenarios	drift in phase coherence	per-phase wall-time	check for monotonic-clock assumptions	KCCL timing chart
security	enumerate auth flows, A2A policy rules	red-team scenarios (Family M-style, expanded)	GWF anchor diff across commits	per-rule eval cost	bandit + custom security pillar	threat-detection report
telemetry	scrape Kimera's `/metrics` endpoint	claims about Prometheus alert firing rate	metric cardinality drift	passive — uses existing exporter	check alert_rules.yml well-formedness	Grafana-panel embeds
lifecycle	inspect snapshot manifests, frozen-on-fall manifests	snapshot tamper-detection scenarios	snapshot ID divergence across nodes	snapshot read time	verify HMAC + ed25519 on every artifact	lifecycle audit trail

Every stratum slots into one or more of the existing six wheels. No new wheel is needed — but each existing wheel needs new pillars/probes.

4.3 The single biggest leverage move: consume the Prometheus stream

Kimera already exports metrics. Ophamin's seeing/ wheel should add a PrometheusScrapeProbe that:

Connects to Kimera's Prometheus exporter (configured at infrastructure/monitoring/prometheus_config.yml).
Streams metrics into Ophamin's measurement engine as a passive observation stream alongside the active scenario stream.
Aligns timestamps with Ophamin's pre-registered scenario windows so (scenario, prometheus_snapshot_before, prometheus_snapshot_during, prometheus_snapshot_after) becomes a single signed observation.
Triggers measuring/scenarios/observability_* scenarios that test claims about the alert rules themselves (e.g. "the KimeraHighPhiVariance alert fires when Φ stdev > 0.1 over a 5-min window").

This single addition unlocks observability into roughly 20 of Kimera's 68 infrastructure subpackages with one probe and no per-subpackage probe code on Ophamin's side, because they all already emit to Prometheus.
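
The scrape-and-align steps above can be sketched with the standard library alone. This is a simplified illustration, not the proposed `PrometheusScrapeProbe` itself: `parse_exposition`, `scrape`, and `observe_window` are hypothetical names, the parser ignores HELP/TYPE comments and optional timestamps, the exporter URL must be supplied by the caller (the real port is unverified, see section 6), and the "during" snapshot is omitted for brevity.

```python
import urllib.request
from typing import Callable, Dict


def parse_exposition(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition format into {series: value}."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, *rest = line.split()
        if not rest:
            raise ValueError(f"sample line has no value: {line!r}")
        samples[series] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples


def scrape(url: str, timeout_s: float = 5.0) -> Dict[str, float]:
    """Fetch one snapshot from an exporter endpoint (URL is caller-supplied)."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return parse_exposition(resp.read().decode("utf-8"))


def observe_window(snapshot: Callable[[], Dict[str, float]],
                   run_scenario: Callable[[], object]) -> dict:
    """Bracket a scenario run with before/after snapshots and their deltas."""
    before = snapshot()
    result = run_scenario()
    after = snapshot()
    deltas = {k: after[k] - before[k] for k in before.keys() & after.keys()}
    return {"result": result, "before": before, "after": after, "delta": deltas}
```

In use, `snapshot` would be something like `lambda: scrape(exporter_url)`, and the returned dictionary is the payload that would then be signed as a single observation.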

4.4 The second biggest leverage move: expand the adapter's field extraction

Today's adapter copies ~15 fields from OrchestratorResult. Replace with a field-projection layer that:

Maintains a JSON allowlist mapping OrchestratorResult fields → opaque raw keys.
Defaults to copying ~50 high-signal fields (the existing 15 + Walker M1/M2/M3/M4 counters + per-phase wall-time + per-stratum substrate signals).
Lets a scenario opt-in to additional fields via scenario.required_raw_fields = ["eikonal_mean_arrival_time", ...].
Validates at run-time that the requested fields exist on the result — loud failure if Kimera renames a field (which surfaces drift instead of hiding it).

This costs one file (a field_projection.py in seeing/substrate/), requires no changes to existing scenarios, and immediately makes roughly 3× more fields available to measuring pillars by default (~50 versus ~15), with the rest of the 638 reachable by opt-in.
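
A minimal sketch of such a projection layer, under stated assumptions: `project_fields` and `MissingFieldError` are hypothetical names, the allowlist is a plain JSON object mapping result attribute names to raw keys, and the result object is accessed by attribute. The real `OrchestratorResult` may differ.

```python
import json
from typing import Any, Dict, Iterable, Mapping


class MissingFieldError(KeyError):
    """Raised when a requested field is absent from the cycle result."""


def project_fields(result: Any,
                   allowlist: Mapping[str, str],
                   required: Iterable[str] = ()) -> Dict[str, Any]:
    """Copy allowlisted and scenario-required fields into a raw dict.

    allowlist maps attribute name on the result -> key in the raw output.
    required fields not in the allowlist are copied under their own name.
    """
    wanted = dict(allowlist)
    for name in required:
        wanted.setdefault(name, name)
    # Fail loudly: a renamed upstream field should surface as drift.
    missing = sorted(name for name in wanted if not hasattr(result, name))
    if missing:
        raise MissingFieldError(f"fields missing on result: {missing}")
    return {raw_key: getattr(result, name) for name, raw_key in wanted.items()}


def load_allowlist(json_text: str) -> Dict[str, str]:
    """Load the allowlist from JSON and check that it is a flat str->str map."""
    data = json.loads(json_text)
    if not isinstance(data, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in data.items()
    ):
        raise ValueError("allowlist must be a JSON object of string pairs")
    return data
```

A scenario that sets `required_raw_fields = ["eikonal_mean_arrival_time"]` would pass that list as `required`; if the field has been renamed upstream, the run stops with `MissingFieldError` instead of silently recording nothing.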

4.5 Three new measuring pillars plus one cross-cutting pillar (additive to the existing six: O · F · A · M · I · N)

Pillar	What it does	Backing library
L (atency)	Quantifies per-stratum latency distributions (p50/p95/p99/max) via `numpy.percentile`, with bootstrap confidence intervals playing the role that Wilson CIs play for proportions	numpy / statsmodels
B (andwidth)	Quantifies bytes-on-wire for transport scenarios, with RIBLT savings versus full-state baselines as a paired comparison	numpy
A (vailability)	Quantifies uptime / error rates / circuit-breaker trips across the multi-protocol interface layer	scipy.stats binomial
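
As an illustration of the statistics behind the L and A pillars, here is a standard-library sketch of a Wilson score interval for an availability proportion and a percentile-bootstrap interval for a latency percentile. The function names are hypothetical, and a production pillar would more likely lean on numpy, statsmodels, or scipy as the table suggests.

```python
import math
import random
from typing import Sequence, Tuple

Z95 = 1.959964  # two-sided 95% standard-normal quantile


def wilson_interval(successes: int, trials: int,
                    z: float = Z95) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (e.g. uptime)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def percentile(sorted_xs: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted sequence, 0 < q <= 100."""
    rank = max(1, math.ceil(q * len(sorted_xs) / 100))
    return sorted_xs[rank - 1]


def bootstrap_percentile_ci(samples: Sequence[float], q: float,
                            n_boot: int = 2000,
                            seed: int = 0) -> Tuple[float, float]:
    """95% percentile-bootstrap interval for the q-th latency percentile."""
    rng = random.Random(seed)  # seeded so a signed record is reproducible
    n = len(samples)
    estimates = sorted(
        percentile(sorted(rng.choices(samples, k=n)), q)
        for _ in range(n_boot)
    )
    return percentile(estimates, 2.5), percentile(estimates, 97.5)
```

Seeding the resampler matters for the proof-record discipline: the same inputs should reproduce the same interval when a record is re-verified.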

Plus one cross-cutting pillar:

Pillar	What it does
Σ (correlation)	Cross-stratum correlation: when GWF fires (security), does request latency spike (interface)? When CRDT merge slows (reconciliation), does cognitive Φ drift (cognitive)? Pearson + Spearman with multiple-comparison correction
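
A sketch of the two ingredients the Σ pillar needs: a rank correlation between two per-window signals, and a multiple-comparison correction across many stratum pairs. Holm-Bonferroni is used here as one reasonable choice; the source does not name a specific correction. The helper names are hypothetical, and the p-values themselves are assumed to come from elsewhere (for example `scipy.stats.spearmanr`).

```python
import math
from typing import List, Sequence


def ranks(xs: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def holm(p_values: Sequence[float]) -> List[float]:
    """Holm-Bonferroni adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is scaled by m, the next by m - 1, and so on;
        # the running max keeps the adjusted values monotone.
        running = max(running, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running)
    return adjusted
```

With nine strata there are 36 stratum pairs before any per-metric fan-out, so an uncorrected 0.05 threshold would be expected to produce spurious "correlations" on almost every run.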

4.6 Three new auditing pillars (additive to the existing six)

Pillar	Wraps	Surfaces
schema-audit	sqlfluff or sqlglot on `*.sql` + `schema/*.py`	drift in PostgreSQL repository schemas, migration gaps
api-audit	OpenAPI spec extraction from FastAPI app + `schemathesis` blackbox tests	undocumented endpoints, schema-vs-implementation drift
security-config-audit	parse `alert_rules.yml`, `alertmanager.yml`, `prometheus_config.yml`, A2A `policy_gatekeeper.py` rules	mis-configured rules, missing alert coverage, over-broad anchors
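
As a small illustration of the security-config-audit idea, the sketch below checks the basic well-formedness of a Prometheus rule file that has already been parsed into a dictionary (for instance with PyYAML's `yaml.safe_load`). `audit_rule_groups` is a hypothetical helper, and it covers only structural checks: unique group names, exactly one of `alert` or `record` per rule, and a non-empty `expr`. Whether Kimera's `alert_rules.yml` passes is not something this document has verified.

```python
from typing import Any, Dict, List


def audit_rule_groups(doc: Dict[str, Any]) -> List[str]:
    """Return human-readable findings for a parsed Prometheus rule file."""
    findings: List[str] = []
    seen_groups = set()
    for group in doc.get("groups") or []:
        name = group.get("name")
        if not name:
            findings.append("group without a name")
        elif name in seen_groups:
            findings.append(f"duplicate group name: {name}")
        seen_groups.add(name)
        for i, rule in enumerate(group.get("rules") or []):
            where = f"{name}[{i}]"
            # A rule is either an alerting rule or a recording rule.
            if ("alert" in rule) == ("record" in rule):
                findings.append(
                    f"{where}: needs exactly one of 'alert' or 'record'")
            if not str(rule.get("expr", "")).strip():
                findings.append(f"{where}: missing 'expr'")
    return findings
```

An empty list means the file passed these checks; anything else becomes an audit finding that the reporting wheel can surface next to the ruff and bandit results.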

4.7 Three new scenarios (one per tier, exercising the new strata)

Tier	Scenario	Stratum	Claim
Scientific	InterfaceContractStability	interface	Across 30 days of Kimera commits, the OpenAPI spec breaking-changes count is ≤ N at every commit boundary (signed)
Engineering	TransportRecoveryTime	transport	After a simulated partition + reconnect, all seven transport adapters resync the G-Set to Jaccard = 1.0 within T seconds (signed)
Philosophical	MultiStratumSelfModel	telemetry	When the substrate processes text about its own infrastructure (a `prometheus_config.yml` block, an Istio `VirtualService` definition), the `system_homeostasis_monitor`'s `homeostasis_score` differs significantly from neutral text (Mann-Whitney U for independent samples; Wilcoxon signed-rank if the texts are paired)
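
The TransportRecoveryTime claim rests on two small pieces: Jaccard similarity between two replicas' G-Sets, and a timer that polls until the replicas agree. The sketch below shows both under simplifying assumptions: replicas are modelled as Python sets read through caller-supplied callables, and `time_to_convergence` is a hypothetical helper rather than anything in Kimera or Ophamin.

```python
import time
from typing import AbstractSet, Callable


def jaccard(a: AbstractSet, b: AbstractSet) -> float:
    """|a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    return 1.0 if not union else len(a & b) / len(union)


def time_to_convergence(read_a: Callable[[], AbstractSet],
                        read_b: Callable[[], AbstractSet],
                        timeout_s: float,
                        poll_s: float = 0.05) -> float:
    """Seconds until both replicas hold the same set, or TimeoutError."""
    start = time.monotonic()  # monotonic clock: immune to wall-clock jumps
    while True:
        if jaccard(read_a(), read_b()) == 1.0:
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError(f"replicas did not converge in {timeout_s}s")
        time.sleep(poll_s)
```

Because a G-Set only grows and merges by union, equality of the two sets is the natural convergence criterion; the pre-registered bound T is then passed as `timeout_s`, and a `TimeoutError` falsifies the claim.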

5. What this reframing buys

Before (v0.1)	After (v0.2)
11 cognitive primitives observable	11 cognitive + ~50 infrastructure surfaces
~15 `OrchestratorResult` fields per cycle	~50 default + opt-in for the remaining ~590
Six wheels, all targeting cognitive	Six wheels × nine strata = 54-cell coverage matrix
No telemetry consumer	Prometheus + Grafana + Alertmanager stream consumer
O · F · A · M · I · N (6 measuring pillars)	+ L · B · A · Σ (10 pillars)
ruff · bandit · mypy · vulture · radon · pip-audit (6 audit pillars)	+ schema-audit · api-audit · security-config-audit (9 pillars)
Six scenarios	+ InterfaceContractStability + TransportRecoveryTime + MultiStratumSelfModel (9 scenarios)
One target per scenario	`(stratum, target)` tuple — many-to-many
Falsifiable claims about cognition	Falsifiable claims about cognition + infrastructure + cross-stratum interaction

The shape of the framework doesn't change — it just gains depth.

6. Honest unknowns

MCP transport authentication: I haven't verified whether Kimera's MCP server runs in stdio-only mode (no network surface) or with HTTP/SSE (network-exposed). If stdio-only, MCP observability is limited to process-level instrumentation.
Kafka/RabbitMQ/NATS deployment status: code is present; whether these are actually wired into production runtime or are WIRE_CANDIDATEs for future Archipel deployment is unclear from CLAUDE.md alone. Operational observability requires they actually fire.
Prometheus exporter actively scrapeable: I see the .yml config and the prometheus_exporter.py module but haven't confirmed the exporter is bound to a port at runtime. Verify via lsof -iTCP:<port> against a running Kimera instance before building the consumer.
47 REST routers production-bound: similarly, the routers exist in source but whether the FastAPI app actually mounts all 47 on a running instance is a separate question. kimera_swm/api/__init__.py probably knows.
A2A protocol traffic: A2A is implemented; whether there are currently any agents talking on it during a typical Kimera run is a separate measurement. If A2A is dormant in single-node Kimera, observability is hypothetical until Archipel deployment lights it up.

Each of these should be settled by a seeing/discovery/-style probe rather than a guess. Step zero before building v0.2 is measuring which surfaces are live.
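
A liveness check of this kind can be very small. The sketch below is a programmatic stand-in for the `lsof` check mentioned above: it tries a TCP connect to each candidate surface and records whether anything is listening. `is_listening` and `probe_surfaces` are hypothetical names, and the host/port pairs must come from the caller because the actual ports are not known from the source.

```python
import socket
from typing import Dict, Tuple


def is_listening(host: str, port: int, timeout_s: float = 0.5) -> bool:
    """True if a TCP connect to (host, port) succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        return sock.connect_ex((host, port)) == 0


def probe_surfaces(candidates: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Map each named surface to whether its endpoint accepts connections."""
    return {name: is_listening(host, port)
            for name, (host, port) in candidates.items()}
```

An open port only shows that something is bound there; confirming that it is the Prometheus exporter or a mounted router still needs a protocol-level request on top.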

7. Suggested next step ordering

If we commit to v0.2:

Discovery probes first (1–2 sessions): for each of the nine strata, land a seeing/discovery/<stratum>_inventory.py that enumerates the primitives Kimera actually exposes at runtime (not from source). This builds the empirical baseline of "what's live."
Field projection (1 session): replace the adapter's ad-hoc field extraction with field_projection.py + allowlist.
Prometheus consumer (1 session): land a PrometheusScrapeProbe in seeing/ + an observability_* scenario that uses it.
One new scenario per tier (3 sessions): InterfaceContractStability (scientific) → TransportRecoveryTime (engineering) → MultiStratumSelfModel (philosophical).
Four new measuring pillars (1 session): L · B · A plus the cross-cutting Σ. Wire them into the existing engine with the same Wilson-CI / Cohen's d / Mann-Whitney discipline applied to their respective signal types.
Three new auditing pillars (1 session): schema-audit, api-audit, security-config-audit.

Eight to nine focused sessions. Each one ends with green tests + a signed proof record demonstrating the new surface. No big-bang rewrite.

8. What stays the same

The non-deletion stance toward Kimera primitives.
Pre-registration discipline — claims before measurement.
Loud failure / no silent fallbacks.
Signed, content-addressed, HMAC-verified proof records.
The six wheels and three tiers.
MockSubstrate so the framework runs without Kimera present.
Independence from Kimera at the code level (only the adapter knows).
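
For readers unfamiliar with the proof-record discipline, the sketch below shows what "signed, content-addressed, HMAC-verified" can mean in practice. It is an illustration under assumptions, not Ophamin's actual record format: the field names `content_id` and `signature`, the canonical-JSON encoding, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_bytes(payload: Dict[str, Any]) -> bytes:
    """Deterministic JSON encoding so equal payloads hash identically."""
    return json.dumps(payload, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


def seal(payload: Dict[str, Any], key: bytes) -> Dict[str, Any]:
    """Attach a content address (SHA-256) and an HMAC signature."""
    body = canonical_bytes(payload)
    return {
        "payload": payload,
        "content_id": hashlib.sha256(body).hexdigest(),
        "signature": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }


def verify(record: Dict[str, Any], key: bytes) -> bool:
    """Recompute both digests and compare them in constant time."""
    body = canonical_bytes(record["payload"])
    expected_id = hashlib.sha256(body).hexdigest()
    expected_sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected_id, record["content_id"])
            and hmac.compare_digest(expected_sig, record["signature"]))
```

The content address makes records deduplicable and referencable by hash, while the HMAC ties them to a key holder; a record tampered with after sealing fails `verify`, which is the loud failure the list above asks for.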

Authored by Claude (Opus 4.7 1M context), 2026-05-15, after re-reading CLAUDE.md, EMPIRICAL_VALIDATION.md, KIMERA_STATE.md, and walking the infrastructure tree. Awaiting owner decision on scope before any implementation work.