The empirical formula
The equation of resolution authority
The discipline does not start from a conclusion, but from an equation awaiting verification. Here it is written in mathematical language, with measurable variables and the exact threshold that would falsify it.
The model
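Let $G = (V, E)$ be the knowledge graph. Each entity is a node $i \in V$; each directed edge $j \to i$ in $E$ carries a semantic type $\tau(j, i)$ (author, sameAs, member, citation…). The resolution authority is defined by the recursion:

$$R_i = (1 - d)\,a_i + d \sum_{j \to i} \frac{w_{\tau(j,i)}}{W_j}\,R_j \tag{1}$$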
- $R_i$: resolution authority. How strongly the system "resolves" entity $i$. A latent quantity; its observable is defined below.
- $a_i$: intrinsic authority. Signal a node accumulates on its own: domain age, history, backlinks. For a new domain it is ≈ 0.
- $w_{\tau(j,i)}$: weight of the relation type. The heart of the generalisation: each type of edge weighs differently. PageRank is the case where all types weigh the same.
- $W_j = \sum_{k:\, j \to k} w_{\tau(j,k)}$: outgoing normalisation. The sum of the weights of the edges leaving $j$: the authority $j$ distributes is split among its neighbours.
- $d$: damping. Factor weighing how much authority comes from the network versus the node itself.
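The recursion can be evaluated by fixed-point iteration. The sketch below is a minimal illustration, not a reference implementation: it assumes the recursion has the form $R_i = (1 - d)\,a_i + d \sum_{j \to i} (w_{\tau(j,i)} / W_j)\,R_j$, and the toy graph, the values in `TYPE_WEIGHTS`, and the function name `resolution_authority` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical relation-type weights (illustrative values, not measured ones).
TYPE_WEIGHTS = {"author": 1.0, "sameAs": 2.0, "citation": 0.5}


def resolution_authority(edges, intrinsic, d=0.85, iterations=200):
    """Iterate R_i = (1 - d) * a_i + d * sum_{j -> i} (w_type / W_j) * R_j.

    edges: list of (source, target, relation_type) tuples.
    intrinsic: dict mapping every node to its intrinsic authority a_i.
    """
    # W_j: total weight of the edges leaving j (outgoing normalisation).
    out_weight = defaultdict(float)
    for src, _dst, rel in edges:
        out_weight[src] += TYPE_WEIGHTS[rel]

    authority = dict(intrinsic)  # start from the intrinsic signal
    for _ in range(iterations):
        updated = {node: (1 - d) * a for node, a in intrinsic.items()}
        for src, dst, rel in edges:
            share = TYPE_WEIGHTS[rel] / out_weight[src]
            updated[dst] += d * share * authority[src]
        authority = updated
    return authority


# Toy graph: "new" has no intrinsic authority and one incoming sameAs edge.
edges = [
    ("established", "new", "sameAs"),
    ("established", "other", "citation"),
    ("other", "established", "citation"),
]
intrinsic = {"established": 1.0, "other": 0.5, "new": 0.0}
R = resolution_authority(edges, intrinsic)
```

In this toy graph `new` starts with zero intrinsic authority, so whatever value it ends with arrives through its single `sameAs` edge.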
Closed form
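Setting the typed transition matrix $M_{ij} = w_{\tau(j,i)} / W_j$ (zero where there is no edge $j \to i$), equation (1) solves in closed form:

$$R = (1 - d)\,(I - dM)^{-1}\,a$$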
Each node's authority is thus a linear transformation of the vector of intrinsic authorities $a$: what is yours, propagated along every path of the graph, weighted by relation type.
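The "every path" reading follows from a short derivation. The steps below are a sketch under stated assumptions: equation (1) is taken in vector form as $R = (1 - d)\,a + d\,M R$, with $M_{ij} = w_{\tau(j,i)} / W_j$ and $0 \le d < 1$; the symbol names are this sketch's notation.

```latex
\begin{aligned}
R &= (1 - d)\,a + d\,M R \\
(I - dM)\,R &= (1 - d)\,a \\
R &= (1 - d)\,(I - dM)^{-1} a
   \;=\; (1 - d) \sum_{k=0}^{\infty} d^{k} M^{k} a
\end{aligned}
```

Each column of $M$ sums to at most 1, so the series converges whenever $d < 1$. The term $d^{k} M^{k} a$ collects the intrinsic authority that reaches each node along paths of exactly $k$ edges, each path weighted by the product of its normalised type weights.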
Continuity: PageRank as the degenerate case
If a single edge type exists and $a_i = 1/N$ for each of the $N$ nodes, then $w_{\tau(j,i)} / W_j = 1/\mathrm{outdeg}(j)$ and (1) returns exactly classic PageRank. The lineage is coherent: at one end a neural network learns its weights (uncontrollable weights), at the other PageRank is pure propagation structure (unit weights), and Ontopoietica stands in between: propagation over typed edges.
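The degenerate case can be checked numerically. The sketch below assumes the typed recursion $R_i = (1 - d)\,a_i + d \sum_{j \to i} (w_{\tau(j,i)} / W_j)\,R_j$ and the classic form $PR_i = (1 - d)/N + d \sum_{j \to i} PR_j / \mathrm{outdeg}(j)$. The three-node graph and the function names are hypothetical, and dangling nodes (which PageRank implementations treat in various ways) are deliberately avoided.

```python
def typed_authority(edges, intrinsic, weights, d=0.85, iterations=200):
    """R_i = (1 - d) * a_i + d * sum_{j -> i} (w_type / W_j) * R_j."""
    out_weight = {}
    for src, _dst, rel in edges:
        out_weight[src] = out_weight.get(src, 0.0) + weights[rel]
    R = dict(intrinsic)
    for _ in range(iterations):
        nxt = {node: (1 - d) * a for node, a in intrinsic.items()}
        for src, dst, rel in edges:
            nxt[dst] += d * weights[rel] / out_weight[src] * R[src]
        R = nxt
    return R


def classic_pagerank(links, nodes, d=0.85, iterations=200):
    """PR_i = (1 - d) / N + d * sum_{j -> i} PR_j / outdegree(j)."""
    n = len(nodes)
    outdeg = {node: 0 for node in nodes}
    for src, _dst in links:
        outdeg[src] += 1
    pr = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        nxt = {node: (1 - d) / n for node in nodes}
        for src, dst in links:
            nxt[dst] += d * pr[src] / outdeg[src]
        pr = nxt
    return pr


nodes = ["a", "b", "c"]
links = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
typed = typed_authority(
    [(src, dst, "link") for src, dst in links],
    {node: 1.0 / len(nodes) for node in nodes},
    {"link": 3.0},  # any single positive weight cancels in w / W_j
)
classic = classic_pagerank(links, nodes)
max_gap = max(abs(typed[node] - classic[node]) for node in nodes)
```

With one edge type the weight cancels against the normalisation, so `max_gap` stays at floating-point noise whatever value is chosen for `"link"`.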
The empirical pivot: the virgin domain
Here the formula meets the experiment. For a new domain (this domain), intrinsic authority is null:

$$a_i = 0$$

Substituting into (1), the first term vanishes and only propagation remains:

$$R_i = d \sum_{j \to i} \frac{w_{\tau(j,i)}}{W_j}\,R_j$$
The virgin entity's authority, at the start, cannot come from itself: it comes entirely from the propagation term, that is, from the edges toward already-resolved entities (the author Paolo Galbiati; the sameAs to the already-settled term on profpaul.icu). It is the variable isolated by construction: if $R_i$ becomes positive with $a_i = 0$, it was propagation, not age.
The observable
$R_i$ is latent: it is not measured directly. It is measured through the resolution indicator $\rho_i(t) \in \{0, 1, 2\}$ at time $t$:
- 0 — not resolved (no mention)
- 1 — mentioned and disambiguated as an entity
- 2 — cited as a source on a neutral query
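and through the time to first resolution, counted from the time $t_0$ at which the edges are in place:

$$T_i = \min\{\, t \ge t_0 : \rho_i(t) \ge 1 \,\} - t_0$$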
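As an illustration of how the two observables could be derived from a log of dated checks, here is a minimal sketch. It assumes "first resolution" means the first observation with level 1 or higher; the function name, the dates, and the levels in `log` are hypothetical and made up for the example.

```python
from datetime import date

# Resolution levels as listed above.
NOT_RESOLVED, DISAMBIGUATED, CITED = 0, 1, 2


def time_to_first_resolution(observations, start):
    """Return days from `start` to the first observation with level >= 1.

    observations: iterable of (date, level) pairs, in any order.
    Returns None when the entity never resolves in the observed window.
    """
    resolved_days = [
        (day - start).days
        for day, level in observations
        if day >= start and level >= DISAMBIGUATED
    ]
    return min(resolved_days) if resolved_days else None


# Hypothetical observation log (dates and levels are illustrative only).
log = [
    (date(2025, 1, 10), NOT_RESOLVED),
    (date(2025, 1, 24), NOT_RESOLVED),
    (date(2025, 2, 7), DISAMBIGUATED),
    (date(2025, 2, 21), CITED),
]
t0 = date(2025, 1, 1)
T = time_to_first_resolution(log, t0)  # 37 days for this log
```

Returning `None` rather than a large number keeps "never resolved" distinct from "resolved late", which matters when the result is compared against a horizon.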
The falsification threshold
The formula is not an opinion: it makes two checkable predictions, both fixed before any data are collected.
Strong hypothesis (existence). With $a_i = 0$ and the edges in place at time $t_0$, the entity resolves within horizon $H$:

$$\exists\, t \in [t_0,\, t_0 + H] : \rho_i(t) \ge 1$$
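Graded hypothesis (monotonicity). Across several virgin entities with different propagation $P_i = d \sum_{j \to i} (w_{\tau(j,i)} / W_j)\,R_j$, the time to resolution $T_i$ decreases as $P_i$ grows:

$$P_i > P_k \;\Rightarrow\; T_i < T_k$$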
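One way to score the graded hypothesis on a handful of entities is a rank correlation between propagation and time to resolution. The sketch below uses Kendall's tau, which is a choice made here for illustration rather than a measure named in the text; the data values are invented. A value of -1 means perfectly monotone decrease; values near 0 mean no ordering.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical data: propagation term and days to first resolution
# for four virgin entities (values invented for the example).
propagation = [0.05, 0.12, 0.20, 0.31]
days_to_resolution = [80, 61, 64, 30]
tau = kendall_tau(propagation, days_to_resolution)
```

Here five of the six pairs are ordered as the hypothesis predicts and one is not, so `tau` is negative but not -1.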
No grounds to proceed
The formula is falsified if, with edges in place and crawled and $a_i = 0$, we observe $\rho_i(t) = 0$ for every $t \le t_0 + H$. In that case propagation alone is not enough, intrinsic authority is necessary, and age cannot be bypassed by edges. It is not a defeat: it is a verdict, and it says exactly which term of (1) needed revising.
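The decision rule can be written down before any data arrive. The following is a minimal sketch, assuming observations are keyed by days since the edges went live and that resolution means a level of 1 or higher; the function name, the verdict labels, and the 90-day horizon in the usage lines are hypothetical.

```python
def strong_hypothesis_verdict(levels_by_day, horizon, intrinsic, edges_crawled):
    """Classify one virgin entity against the strong hypothesis.

    levels_by_day: dict mapping days since t0 -> observed resolution level.
    horizon: length of the observation window in days.
    """
    if intrinsic != 0 or not edges_crawled:
        return "not testable"  # preconditions of the test are not met
    in_window = {
        day: level for day, level in levels_by_day.items() if 0 <= day <= horizon
    }
    if any(level >= 1 for level in in_window.values()):
        return "supported"  # resolved inside the window
    if max(levels_by_day, default=-1) < horizon:
        return "pending"  # window not yet fully observed
    return "falsified"  # window elapsed with no resolution


# Usage with a hypothetical 90-day horizon.
resolved = strong_hypothesis_verdict({10: 0, 40: 1}, 90, 0.0, True)
unresolved = strong_hypothesis_verdict({10: 0, 90: 0}, 90, 0.0, True)
too_early = strong_hypothesis_verdict({10: 0, 40: 0}, 90, 0.0, True)
```

Keeping "pending" and "not testable" separate from "falsified" prevents an incomplete or invalid run from being read as a refutation.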
What the formula is, and is not
Equation (1) is a model consistent with the observed behaviour of graph-based retrieval systems — not a claim to be the internal equation of a specific engine, whose weights are not accessible. It is falsifiable in its predictions, not as a reconstruction of a proprietary implementation. This distinction is part of the method: a model that predicts and lets itself be refuted has value; one that claims to describe what it cannot observe does not.
References
- L. Page, S. Brin, R. Motwani, T. Winograd — The PageRank Citation Ranking: Bringing Order to the Web (Stanford InfoLab, 1999).
- Google — Introducing the Knowledge Graph: things, not strings (2012).
- Schema.org — DefinedTerm: the standard used to structure the entities of this site.
This formalisation renders in notation the argument of the essay The Data-Identity Principle, refined by the discipline's four principles. The numerical check is in the experiment.