Open notebook · pre-registered

Ontopoietic Notes

Day one · 20 June 2026 · t₀ = 03:20 UTC · no hypothesis confirmed yet

This is the experiment's working notebook, published before the data. It is not "the theory of how Google works": these are hypotheses and models written down so you can check them, criticise them and — if needed — falsify them. Notes, not revealed truth.

Observable: a measurable prediction, with a declared threshold of refutation. This is the science.

Model, loose: an inferred mechanism, elegant and consistent with observed behaviour, but not observable. It is a lens, not a verified description.

Rule of the notebook: never confuse the two columns. Plausible ≠ proven. A negative result is still a result.

The shadow and the structure · Observable

The foundation stone, on which everything else rests. The ultimate object, the deep structure that generates resolutions, is inaccessible (it lives inside the engine). But its shadow, the observable behaviour, is measurable. And what is measurable can become science: from the shadow one builds invariants, metrics, models, predictions. (This is structural realism: we cannot know the nature of the unobservable, but we can know its structure, because structure is what is preserved across observations.)

All of this holds under one condition: that the shadow↔structure correspondence is sufficiently stable. In physics that stability comes for free: nature does not change its laws to evade you. Here it does not: the structure is non-stationary (the algorithm changes) and adversarial (it resists reverse-engineering, because whatever can be stably inferred can be gamed). The shadow may move under your feet on purpose.

So stability is not assumed: it is measured, by test-retest of the metrics (do R̂, C, P_conv reproduce over time and across replications?) and by robustness across a known algorithm change. If the invariants hold, the "sufficiently stable" condition is earned; if they jump unpredictably, that too is a result: at this resolution, no science is possible here. This is meta-falsifiability: the discipline puts even its own precondition to the test.
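A minimal sketch of what a test-retest check could look like, assuming two replications of the same metric series (for example R̂ sampled on the same days by two independent runs). The function names and the thresholds `min_r` and `max_gap` are hypothetical; the notebook does not fix them here, and the numbers are toy values, not data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sx * sy)

def is_stable(run_a, run_b, min_r=0.8, max_gap=0.25):
    """Hypothetical rule: replications must correlate and stay close pointwise."""
    gap = max(abs(a - b) for a, b in zip(run_a, run_b))
    return pearson(run_a, run_b) >= min_r and gap <= max_gap

# Toy replications of a metric over four days (illustrative values only).
run_a = [0.0, 0.5, 1.0, 1.5]
run_b = [0.1, 0.4, 1.1, 1.4]   # tracks run_a closely
run_c = [1.5, 0.0, 1.0, 0.2]   # jumps unpredictably

print(is_stable(run_a, run_b), is_stable(run_a, run_c))
```

Whatever rule is used, the point of pre-registration is that `min_r` and `max_gap` are written down before the runs, so that an unstable outcome counts as a result rather than as a reason to retune.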

The thesis

In an agent-mediated web, engines do not choose a page: they resolve an entity. The question shifts from "am I findable?" to "am I resolved?". An entity's authority is not declared, it is propagated by neighbouring nodes along typed edges. The four principles are its frame.

Public equation: resolution authority · Observable

A typed generalization of PageRank. A model consistent with observed behaviour; falsifiable in its predictions.

$$A(i) = (1 - d)\, b(i) + d \sum_{j \in \mathrm{In}(i)} \frac{w(τ_{ji})}{W(j)}\, A(j)$$

Experiment pivot: virgin domain → $b(v) \approx 0$ → only propagation remains. Detail in formalization.
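The equation can be iterated to a fixed point much like classic PageRank. The sketch below reads $d$ as a damping factor, $b(i)$ as the node's own baseline, $w(τ_{ji})$ as a weight attached to the type of the edge from $j$ to $i$, and $W(j)$ as the total typed weight leaving $j$; these readings, the edge types and every numeric value are assumptions made for illustration, not values published in the notebook.

```python
from collections import defaultdict

# Hypothetical edge-type weights w(τ); the real values are not given in the notes.
TYPE_WEIGHTS = {"sameAs": 1.0, "cites": 0.6, "mentions": 0.2}

def resolution_authority(edges, base, d=0.85, iters=100, tol=1e-12):
    """Iterate A(i) = (1-d) b(i) + d * sum_{j in In(i)} w(τ_ji)/W(j) * A(j).

    edges: list of (source j, target i, edge type τ_ji)
    base:  dict node -> b(node); every node must appear as a key
    W(j) is assumed to be the sum of typed weights on edges leaving j.
    """
    out_weight = defaultdict(float)
    for j, _i, t in edges:
        out_weight[j] += TYPE_WEIGHTS[t]

    authority = dict(base)
    for _ in range(iters):
        nxt = {i: (1 - d) * base[i] for i in base}
        for j, i, t in edges:
            nxt[i] += d * TYPE_WEIGHTS[t] / out_weight[j] * authority[j]
        delta = max(abs(nxt[i] - authority[i]) for i in base)
        authority = nxt
        if delta < tol:
            break
    return authority

# Toy graph: "v" is the virgin domain with b(v) = 0.
edges = [("hub", "v", "cites"), ("hub", "other", "mentions"), ("peer", "v", "sameAs")]
base = {"hub": 1.0, "peer": 0.5, "other": 0.0, "v": 0.0}
scores = resolution_authority(edges, base)
print(round(scores["v"], 6))  # about 0.159, all of it propagated from neighbours
```

With $b(v) = 0$ the first term vanishes for `v`, so any non-zero score it receives comes from the typed in-edges alone, which is the situation the experiment pivot is designed to isolate.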

Deep model: virtual node · Model, loose

Where edges converge but no node exists, the system synthesises a virtual node (an implicit entity) from demand, conditioned by context:

$$v_{q} = \frac{1}{Z} \sum_{i} b_{i}\, a_{i}\, n_{i} \cdot φ(x_{i}, ℓ_{i}, t_{i}, g_{i})$$

$b_{i}$ relevance · $a_{i}$ source trust/authority · $n_{i}$ query volume · $φ$ context (text $x$, language $ℓ$, date $t$, provenance $g$) · $Z$ normalisation. The virtual node is the demand-weighted, normalised centroid. When a real node resolves, the centroid collapses onto it.
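A small sketch of the centroid reading of this formula. It assumes that $φ$ has already mapped each piece of evidence to a numeric context vector and that $Z$ is the sum of the weights $b_{i} a_{i} n_{i}$, which is what makes the result a weighted centroid; both are assumptions of this sketch, as is the function name `virtual_node`.

```python
def virtual_node(evidence):
    """Demand-weighted, normalised centroid of context vectors.

    evidence: list of (b_i, a_i, n_i, phi_i), where phi_i is the context
    vector φ(x_i, ℓ_i, t_i, g_i) given as a list of floats.
    Z is assumed to be the sum of the weights b_i * a_i * n_i.
    """
    weights = [b * a * n for b, a, n, _phi in evidence]
    z = sum(weights)
    if z == 0:
        raise ValueError("Z is zero: no demand, so no virtual node is formed")
    dim = len(evidence[0][3])
    return [
        sum(w * item[3][k] for w, item in zip(weights, evidence)) / z
        for k in range(dim)
    ]

# Two toy evidence items with equal total weight but different context vectors.
evidence = [
    (1.0, 0.5, 10, [1.0, 0.0]),
    (0.5, 1.0, 10, [0.0, 1.0]),
]
print(virtual_node(evidence))  # [0.5, 0.5]
```

In this reading, "the centroid collapses onto a real node" would correspond to one evidence item dominating the weights, so that the centroid approaches that item's vector.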

⚪ §6 — not Google's internals. It is a model that rhymes with real techniques (implicit entities, embeddings), not the verified implementation.

Control: KL divergence · Model, loose

So that the system does not repeat a wrong association forever, it needs a "surprise" control that fires when new evidence diverges from the consolidated node.

$$D_{\mathrm{KL}}(P_{t} ∥ Q) > θ$$

Low KL → reflex (cache, no recompute). High KL → perturbation → re-verification. It is the gate between fast-path and slow-path, and it rhymes with the free energy principle (Friston): staying alive by minimising surprise. Two control points: $Z$ at birth, KL during maintenance.
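The gate can be sketched in a few lines, reading $P_{t}$ as the distribution of senses in the new evidence and $Q$ as the distribution held by the consolidated node. That reading, the threshold value and the path names are assumptions of this sketch; as the section says, this is a model, not a verified mechanism.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in nats for discrete distributions given as dicts.

    Missing or zero entries in q are floored at eps to keep the log finite.
    """
    return sum(
        pi * math.log(pi / max(q.get(k, 0.0), eps))
        for k, pi in p.items()
        if pi > 0
    )

def gate(p_new, q_consolidated, theta=0.1):
    """Hypothetical gate: high surprise triggers re-verification."""
    if kl_divergence(p_new, q_consolidated) > theta:
        return "slow-path"   # perturbation, re-verify
    return "fast-path"       # reflex, serve from cache

q = {"discipline": 0.9, "philosophy": 0.1}
same = {"discipline": 0.9, "philosophy": 0.1}
shifted = {"discipline": 0.2, "philosophy": 0.8}

print(gate(same, q), gate(shifted, q))  # fast-path slow-path
```

KL divergence is asymmetric, so the order of the arguments matters: here surprise is measured from the point of view of the new evidence, matching the order $P_{t} ∥ Q$ in the formula.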

⚪ §6 — model, not verified mechanism.

The graph as process · Observable · Model, loose

A conjecture: the knowledge graph is not a static stored object, but a process — subgraphs materialised per context, then discarded. It rhymes with the root of ontopoiesis: being as self-making, not as substance.

Observable: the testable shadow. If the graph were a static object, the same entity would resolve identically regardless of query, language, time. If it is a process, resolution is non-stationary: the variance itself, visible in the consolidation curve (H4) and the locale split (H6), is the footprint of the process. This can be measured.

Model, loose: the non-measurable part. That those subgraphs are materialised and destroyed to minimise the energy of coherence verification is interpretation, not a claim: neither the purpose nor the lifecycle is observable, only the shadow (§6). We keep it as a lens, not as proof.

Falsifiable hypotheses · Observable

The scientific core: each prediction with its refutation, decided before the data.

#	Prediction	Refutation
H1 · existence	b(v)≈0 + edges in place → R≥1 within T	R=0 up to T → no grounds to proceed
H2 · monotonicity	more propagation → faster resolution (∂τ_res/∂P<0)	no correlation
H3 · adjacency	near a strong cluster → own node faster	absorbed, or no faster than isolated
H4 · consolidation	τ_res drops on repeated queries (memory)	τ_res flat
H5 · revisability	resolution can drop if coherence falls	only rises, forever
H6 · context	resolution splits by language/geo/time	identical regardless

R(v,t) ∈ {0 not resolved · 1 disambiguated · 2 cited as source}.
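Two rows of the table can be turned into decision rules that are fixed before any data arrive. The sketch below does this for H1 and H4 only; the function names, the daily sampling and the `min_drop_per_day` margin are hypothetical choices, and the series are toy values.

```python
def h1_existence(levels_by_day, horizon_days):
    """H1: R >= 1 must be reached within the first T = horizon_days samples."""
    window = levels_by_day[:horizon_days]
    return "supported" if any(r >= 1 for r in window) else "refuted"

def slope(ys):
    """Least-squares slope of ys against the indices 0..n-1 (needs n >= 2)."""
    n = len(ys)
    mx = (n - 1) / 2
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    den = sum((x - mx) ** 2 for x in range(n))
    return num / den

def h4_consolidation(tau_res_by_day, min_drop_per_day=0.0):
    """H4: τ_res must fall over repeated queries; a flat curve refutes it."""
    if slope(tau_res_by_day) < -min_drop_per_day:
        return "supported"
    return "refuted"

print(h1_existence([0, 0, 1, 2], horizon_days=3))  # supported
print(h1_existence([0, 0, 0, 1], horizon_days=3))  # refuted: R = 0 up to T
print(h4_consolidation([5, 4, 3, 2]))              # supported
print(h4_consolidation([3, 3, 3, 3]))              # refuted: flat
```

A real analysis would also need a noise model to separate a genuine drop from sampling jitter; the margin parameter is only a placeholder for that decision.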

Measurement protocol · Observable

A quantity is scientific only if a reproducible procedure exists to compute it. One protocol, four measures — all observable proxies for latent quantities (§6), not the quantities themselves.

The dataset. A fixed basket of neutral queries Q (pre-registered), an engine E, a locale L (language ℓ, geo g), a cadence. For each query q at time t we code: cited sources · resolved sense (fixed rubric: discipline / philosophy / ambiguous) · level r(q,t) ∈ {0,1,2} · key attributes asserted (checklist).

Measure	Operational definition	Tests
Resolution authority · R̂(v,t)	mean of r(q,t) over the basket ∈ [0, 2]	H1 / H2
Identity distance · S(t)	separability S = 1 − conf, where conf = fraction of q mixing the two senses	H3 / H6
Semantic coherence · C(t)	fraction of key attributes agreeing across sources and queries ∈ [0, 1]	H5
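Convergence · P_conv(t)	fraction of q resolving primarily to your entity ∈ [0, 1]; reported alongside 1 − normalised entropy of senses, which is related but not equivalent (entropy is also zero when every q resolves to the other sense)	H3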

The same rubric yields all four. Pre-registering the protocol before the data is what prevents tuning it after the fact.
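As a sketch of how one coded basket could yield all four measures, the function below takes the rubric fields listed above for a single time t. The record layout, the choice of "discipline" as the entity's own sense, the rule that an attribute "agrees" when every observation asserting it gives the same value, and the toy records are all assumptions made for illustration.

```python
def measures(observations, own_sense="discipline"):
    """Compute the four proxies from one coded basket at a single time t.

    Each observation is a dict with:
      level       r(q,t) in {0, 1, 2}
      sense       "discipline" | "philosophy" | "ambiguous"
      attributes  dict of checklist attribute -> asserted value
    """
    n = len(observations)
    r_hat = sum(o["level"] for o in observations) / n

    # conf: fraction of queries mixing the two senses (coded "ambiguous").
    conf = sum(o["sense"] == "ambiguous" for o in observations) / n
    separability = 1 - conf

    # An attribute agrees if all observations asserting it give one value.
    seen = {}
    for o in observations:
        for key, value in o["attributes"].items():
            seen.setdefault(key, set()).add(value)
    coherence = sum(len(v) == 1 for v in seen.values()) / len(seen) if seen else 0.0

    p_conv = sum(o["sense"] == own_sense for o in observations) / n
    return {"R_hat": r_hat, "S": separability, "C": coherence, "P_conv": p_conv}

# Toy basket of four coded queries (placeholder values, not measurements).
basket = [
    {"level": 2, "sense": "discipline", "attributes": {"attr_a": "x", "attr_b": "y"}},
    {"level": 1, "sense": "discipline", "attributes": {"attr_a": "x"}},
    {"level": 0, "sense": "philosophy", "attributes": {"attr_b": "z"}},
    {"level": 1, "sense": "ambiguous", "attributes": {}},
]
print(measures(basket))
# {'R_hat': 1.0, 'S': 0.75, 'C': 0.5, 'P_conv': 0.5}
```

Keeping all four measures as pure functions of the same coded records is one way to honour the "same rubric yields all four" rule: nothing is recoded per metric.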

Honest caveats

Homonym: "ontopoietica / ontopoiesis" is an existing philosophical concept (Anna-Teresa Tymieniecka; compare autopoiesis in Maturana & Varela). So it is not an "uncontested term": the real test is disambiguation (own node) vs absorption (a footnote).
Casatese (b>0) ≠ Ontopoietica (b≈0): separate dossiers. The Casatese corroborates the thesis, not the from-zero claim.
Sanity-check ≠ measurement: a query naming the domain/URL checks correctness; only the neutral query measures resolution.

Next steps

Clean test: fresh AI Mode session, neutral query, no domain or name → does ontopoietica.org appear as a source? Certify with a timestamp.
Split (H6): same query IT-from-Italy vs EN-from-US → does the description change?
Curve (H4): repeated query over days → does τ_res drop?
Hit the books: graph theory applied to IR, PageRank, knowledge graph embeddings, the free energy principle, in order to write the new chapter, not rewrite existing ones.
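For the H6 split in the list above, one possible way to quantify "does the description change?" is the total variation distance between the distributions of coded senses in the two locales. The metric choice, the function names and the toy codings are assumptions of this sketch; the notebook does not prescribe a distance.

```python
from collections import Counter

def sense_distribution(senses):
    """Relative frequency of each coded sense in one locale's basket."""
    counts = Counter(senses)
    total = sum(counts.values())
    return {sense: count / total for sense, count in counts.items()}

def total_variation(p, q):
    """Total variation distance in [0, 1]; 0 means identical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Toy codings of the same four queries in two locales (illustrative only).
it_from_italy = ["discipline", "discipline", "discipline", "philosophy"]
en_from_us = ["philosophy", "philosophy", "discipline", "philosophy"]

split = total_variation(
    sense_distribution(it_from_italy),
    sense_distribution(en_from_us),
)
print(split)  # 0.5
```

A value near 0 would match the refutation column of H6 ("identical regardless"); how large a split counts as support is a threshold that would have to be pre-registered.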

The dated milestones are in the roadmap; the formula in formalization; verification in the experiment. These are living notes: they will change with the data, and every change will be visible.