Open notebook · pre-registered
Ontopoietic Notes
This is the experiment's working notebook, published before the data. It is not "the theory of how Google works": these are hypotheses and models written down so you can check them, criticise them and — if needed — falsify them. Notes, not revealed truth.
Rule of the notebook: never confuse the two columns. Plausible ≠ proven. A negative result is still a result.
The shadow and the structure Observable
The foundation stone, from which everything else hangs. The ultimate object, the deep structure that generates resolutions, is inaccessible (it lives inside the engine). But its shadow, the observable behaviour, is measurable. And what is measurable can become science: from the shadow one builds invariants, metrics, models, predictions. (This is structural realism: we cannot know the nature of the unobservable, only its structure, because structure is what is preserved across observations.)
All of this holds under one condition: that the shadow↔structure correspondence is sufficiently stable. In physics that stability comes for free: nature does not change its laws to evade you. Here it does not: the structure is non-stationary (the algorithm changes) and adversarial (it resists reverse-engineering, because what is stably inferable is gameable). The shadow may move under your feet on purpose.
So stability is not assumed: it is measured. Two checks: test-retest of the metrics (do R̂, C, Pconv reproduce over time and across replications?) and robustness across a known algorithm change. If the invariants hold, the "sufficiently stable" is earned; if they jump unpredictably, that too is a result: at this resolution, no science is possible here. This is meta-falsifiability: the discipline puts even its own precondition to the test.
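A minimal sketch of the test-retest check, assuming stability is scored as the Pearson correlation between two replications of the same metric over the same query basket. The function names, the sample scores and the 0.8 threshold are illustrative assumptions, not part of the pre-registered protocol; whatever threshold is used would have to be fixed before the data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(vx * vy)

def is_stable(run_a, run_b, min_r=0.8):
    """Test-retest: do two replications of one metric agree?

    min_r is a hypothetical threshold, to be chosen before the data.
    """
    return pearson(run_a, run_b) >= min_r

# Hypothetical per-query levels r(q, t) from two passes over the same basket.
first_pass = [0, 1, 1, 2, 0, 2]
second_pass = [0, 1, 2, 2, 0, 2]
print(is_stable(first_pass, second_pass))
```

The same function can be run on either side of a known algorithm change: a correlation that collapses across the change is itself a recorded result.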
The thesis
In an agent-mediated web, engines do not choose a page: they resolve an entity. The question shifts from "am I findable?" to "am I resolved?". An entity's authority is not declared, it is propagated by neighbouring nodes along typed edges. The four principles are its frame.
Public equation — resolution authority Observable
A typed generalization of PageRank. A model consistent with observed behaviour; falsifiable in its predictions.
Experiment pivot: virgin domain → b(v) ≈ 0 → only propagation remains. Detail in formalization.
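The public equation itself lives in the formalization and is not reproduced here. As a rough illustration of what a typed generalization of PageRank can look like, the sketch below assumes one particular form: each node keeps a damped share of its intrinsic baseline b(v) (the quantity that appears in H1), and passes the rest of its score along outgoing edges in proportion to a per-type weight. The edge types, the weights, the damping factor and the node names are all hypothetical.

```python
def typed_pagerank(nodes, edges, type_weight, base, damping=0.85, iters=50):
    """Power iteration for a PageRank variant with per-type edge weights.

    nodes: list of node ids
    edges: list of (source, target, edge_type)
    type_weight: dict edge_type -> non-negative weight (assumed values)
    base: dict node -> intrinsic baseline b(v); 0 for a virgin domain
    """
    # Total outgoing weight per source, so each node splits its score.
    out_weight = {v: 0.0 for v in nodes}
    for src, _dst, etype in edges:
        out_weight[src] += type_weight[etype]

    score = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        # Baseline term: what a node has without any propagation.
        nxt = {v: (1.0 - damping) * base.get(v, 0.0) for v in nodes}
        # Propagation term: score flowing in along typed edges.
        for src, dst, etype in edges:
            if out_weight[src] > 0:
                share = type_weight[etype] / out_weight[src]
                nxt[dst] += damping * score[src] * share
        score = nxt
    return score

# Hypothetical three-node graph; "new" is the virgin domain with b(v) = 0.
nodes = ["hub", "peer", "new"]
edges = [("hub", "new", "sameAs"), ("hub", "peer", "mentions"), ("peer", "new", "mentions")]
weights = {"sameAs": 3.0, "mentions": 1.0}
baseline = {"hub": 1.0, "peer": 0.5, "new": 0.0}
scores = typed_pagerank(nodes, edges, weights, baseline)
print(scores)
```

In this toy graph the node with zero baseline still ends up with a positive score, and with no incoming edges it stays at zero: under these assumptions, everything it has is propagated.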
Deep model — virtual node Model, loose
Where edges converge but no node exists, the system synthesises a virtual node (an implicit entity) from demand, conditioned by context:
relevance · source trust/authority · query volume · context (text, language, date, provenance) · normalisation. The virtual node is the demand-weighted, normalised centroid. When a real node resolves, the centroid collapses onto it.
⚪ §6 — not Google's internals. It is a model that rhymes with real techniques (implicit entities, embeddings), not the verified implementation.
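To make "demand-weighted, normalised centroid" concrete, here is a small sketch. It assumes each converging signal carries an embedding vector plus the factors listed above, and that the factors combine by simple multiplication into one weight; that combination rule, the field names and the toy vectors are assumptions for illustration, in the same loose spirit as the model itself.

```python
def virtual_node(signals):
    """Demand-weighted, normalised centroid of the converging signals.

    signals: list of dicts with keys
      'vec' (list of floats), 'relevance', 'trust', 'volume', 'context'
    The factor names mirror the note; multiplying them is an assumption.
    """
    weights = [
        s["relevance"] * s["trust"] * s["volume"] * s["context"] for s in signals
    ]
    total = sum(weights)
    if total <= 0:
        raise ValueError("no demand: the virtual node is undefined")
    dim = len(signals[0]["vec"])
    # Normalisation: divide the weighted sum by the total weight.
    return [
        sum(w * s["vec"][i] for w, s in zip(weights, signals)) / total
        for i in range(dim)
    ]

# Two hypothetical signals; the first carries three times the query volume.
signals = [
    {"vec": [1.0, 0.0], "relevance": 1.0, "trust": 1.0, "volume": 3.0, "context": 1.0},
    {"vec": [0.0, 1.0], "relevance": 1.0, "trust": 1.0, "volume": 1.0, "context": 1.0},
]
node = virtual_node(signals)
print(node)
```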
Control — KL divergence Model, loose
To keep the system from repeating a wrong association forever, the model adds a "surprise" control that fires when new evidence diverges from the consolidated node.
Low KL → reflex (cache, no recompute). High KL → perturbation → re-verification. It is the gate between fast-path and slow-path, and it rhymes with the free energy principle (Friston): staying alive by minimising surprise. Two control points: at birth, KL during maintenance.
⚪ §6 — model, not verified mechanism.
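The gate can be sketched in a few lines, assuming both the consolidated node and the new evidence are summarised as discrete distributions over the same senses. The direction of the divergence (new evidence against the consolidated node), the threshold value and the path labels are assumptions chosen for illustration.

```python
from math import log

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same senses."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Terms with p_i = 0 contribute nothing; eps guards against q_i = 0.
    return sum(pi * log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def route(new_evidence, consolidated, threshold=0.1):
    """Low surprise -> reflex (cache); high surprise -> re-verification."""
    kl = kl_divergence(new_evidence, consolidated)
    return "fast-path" if kl < threshold else "slow-path"

# Hypothetical sense distributions: [own node, competing sense].
consolidated = [0.9, 0.1]
print(route([0.88, 0.12], consolidated))  # evidence close to the stored node
print(route([0.2, 0.8], consolidated))    # evidence that contradicts it
```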
The graph as process Observable Model, loose
A conjecture: the knowledge graph is not a static stored object, but a process — subgraphs materialised per context, then discarded. It rhymes with the root of ontopoiesis: being as self-making, not as substance.
Observable · The testable shadow. If it were a static object, the same entity would resolve identically regardless of query, language, time. If it is a process, resolution is non-stationary: the variance itself, meaning the consolidation curve (H4) and the locale split (H6), is the footprint of the process. This can be measured.
Model, loose · The non-measurable part. That those subgraphs are materialised and destroyed to minimise the energy of coherence verification is interpretation, not a claim: neither the purpose nor the lifecycle is observable, only the shadow (§6). We keep it as a lens, not as proof.
Falsifiable hypotheses Observable
The scientific core: each prediction with its refutation, decided before the data.
| # | Prediction | Refutation |
|---|---|---|
| H1 · existence | b(v)≈0 + edges in place → R≥1 within T | R=0 up to T → no grounds to proceed |
| H2 · monotonicity | more propagation → faster resolution (∂τres/∂P<0) | no correlation |
| H3 · adjacency | near a strong cluster → own node faster | absorbed, or no faster than isolated |
| H4 · consolidation | τres drops on repeated queries (memory) | τres flat |
| H5 · revisability | resolution can drop if coherence falls | only rises, forever |
| H6 · context | resolution splits by language/geo/time | identical regardless |
R(v,t) ∈ {0 not resolved · 1 disambiguated · 2 cited as source}.
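One way to turn the H2 row into a decision that can be fixed in advance is a rank correlation between propagation P and resolution time τres: a clearly negative value supports monotonicity, a value near zero is the "no correlation" refutation. The sketch below assumes Spearman's coefficient as that test; the choice of statistic, the sample numbers and the units are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (vx * vy) ** 0.5

# Hypothetical data: propagation level P and days until resolution.
propagation = [1, 2, 3, 4, 5]
tau_res = [30, 21, 22, 9, 4]
rho = spearman(propagation, tau_res)
print(rho)
```

A handful of points cannot settle H2 on its own; the cut-off for "no correlation" and the minimum sample size belong in the pre-registration.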
Measurement protocol Observable
A quantity is scientific only if a reproducible procedure exists to compute it. One protocol, four measures — all observable proxies for latent quantities (§6), not the quantities themselves.
The dataset. A fixed basket of neutral queries Q (pre-registered), an engine E, a locale L (language ℓ, geo g), a cadence. For each query q at time t we code: cited sources · resolved sense (fixed rubric: discipline / philosophy / ambiguous) · level r(q,t) ∈ {0,1,2} · key attributes asserted (checklist).
| Measure | Operational definition | Tests |
|---|---|---|
| Resolution authority · R̂(v,t) | mean of r(q,t) over the basket ∈ [0, 2] | H1 / H2 |
| Identity distance · S(t) | separability S = 1 − conf, where conf = fraction of q mixing the two senses | H3 / H6 |
| Semantic coherence · C(t) | fraction of key attributes agreeing across sources and queries ∈ [0, 1] | H5 |
| Convergence · Pconv(t) | fraction of q resolving primarily to your entity ∈ [0, 1]; cross-check: 1 − normalised entropy of senses (the two agree only when your entity is the dominant sense) | H3 |
The same rubric yields all four. Pre-registering the protocol before the data is what prevents tuning it after the fact.
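A sketch of how one coded basket could yield all four measures. It assumes each observation is a record with a level, a resolved sense and the key attributes it asserts; that "ambiguous" is the code for a query mixing the two senses; that "discipline" is the sense counted as your entity; and that an attribute agrees when every observation asserting it gives the same value. These coding choices, the field names and the sample records are assumptions, not the pre-registered rubric. The entropy-based figure is computed separately from the plain fraction so the two can be compared.

```python
from collections import Counter
from math import log

SENSES = ("discipline", "philosophy", "ambiguous")  # rubric from the protocol

def measures(observations, own_sense="discipline"):
    """Compute the four proxies from one coded basket at one time t."""
    n = len(observations)
    if n == 0:
        raise ValueError("empty basket")

    # Resolution authority: mean level over the basket, in [0, 2].
    r_hat = sum(o["level"] for o in observations) / n

    # Identity distance: S = 1 - conf, conf = share of mixed-sense queries.
    conf = sum(o["sense"] == "ambiguous" for o in observations) / n

    # Semantic coherence: share of attributes with a single asserted value.
    asserted = {}
    for o in observations:
        for name, value in o["attributes"].items():
            asserted.setdefault(name, set()).add(value)
    coherence = (
        sum(len(v) == 1 for v in asserted.values()) / len(asserted)
        if asserted else None  # nothing asserted, nothing to compare
    )

    # Convergence: share of queries resolving primarily to the own sense.
    p_conv = sum(o["sense"] == own_sense for o in observations) / n

    # Companion figure: 1 - entropy of the sense distribution / log(#senses).
    counts = Counter(o["sense"] for o in observations)
    entropy = -sum((c / n) * log(c / n) for c in counts.values())
    one_minus_entropy = 1.0 - entropy / log(len(SENSES))

    return {"R_hat": r_hat, "S": 1.0 - conf, "C": coherence,
            "P_conv": p_conv, "one_minus_entropy": one_minus_entropy}

# Hypothetical coded basket of four neutral queries.
basket = [
    {"level": 2, "sense": "discipline", "attributes": {"type": "discipline", "site": "ontopoietica.org"}},
    {"level": 1, "sense": "discipline", "attributes": {"type": "discipline"}},
    {"level": 0, "sense": "philosophy", "attributes": {"type": "philosophy"}},
    {"level": 1, "sense": "ambiguous", "attributes": {"site": "ontopoietica.org"}},
]
result = measures(basket)
print(result)
```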
Honest caveats
- Homonym: "ontopoietica / ontopoiesis" is an existing philosophical concept (Anna-Teresa Tymieniecka; autopoiesis by Maturana & Varela). So not an "uncontested term": the real test is disambiguation (own node) vs absorption (a footnote).
- Casatese (b>0) ≠ Ontopoietica (b≈0): separate dossiers. The Casatese corroborates the thesis, not the from-zero claim.
- Sanity-check ≠ measurement: a query naming the domain/URL checks correctness; only the neutral query measures resolution.
Next steps
- Clean test: fresh AI Mode session, neutral query, no domain or name → does ontopoietica.org appear as a source? Certify with a timestamp.
- Split (H6): same query IT-from-Italy vs EN-from-US → does the description change?
- Curve (H4): repeated query over days → does τres drop?
- Hit the books: graph theory applied to IR, PageRank, knowledge graph embeddings, the free energy principle. The aim is to write the new chapter, not rewrite existing ones.
The dated milestones are in the roadmap; the formula in formalization; verification in the experiment. These are living notes: they will change with the data, and every change will be visible.