The Structure Imperative
A ‘geometric proof’ that journalism survives the AI apocalypse if and only if it adopts modular production

This essay presents an argument using a sequence of definitions, axioms, and propositions that build toward a central theorem. The format is deliberate and comedic; the topic is not. Journalism faces an existential challenge — the rise of answer intermediary systems that synthesize content without attribution — and the response requires doing something about it. By 'intermediary systems' I am referring to LLMs and chatbots, but keeping a wide angle for future-proofing: we may very well still be in a Netscape phase.
The argument proceeds from modular journalism as defined in previous work: content composed of discrete information atoms, each answering a specific user question. From this foundation, I will attempt to demonstrate through logical progression that structure is not merely beneficial but necessary for journalism to maintain discoverable presence in an intermediary-mediated information landscape.
The proof is conditional. It depends on platform behavior, regulatory frameworks, and economic models that remain uncertain. These boundary conditions are explicit. Still, this uncertainty is no reason to stop worrying about the apocalypse, or to do nothing about it. What follows is not a prediction but a logical analysis: given the expansion of automated answer systems, journalism's survival depends on adopting paragraph-level structural metadata.
The conclusion may seem stark. It suggests that news organizations refusing modular production face complete disintermediation — not through poor quality or lack of value, but through structural invisibility. The article as an atomic unit disappears. Publishers become dataset providers. Attribution survives only through machine-readable provenance.
Whether this conclusion proves correct depends less on the logic — which I believe holds — than on the external variables discussed in the boundary conditions.
Definitions
D1. User Information Need: A discrete, answerable question that motivates information-seeking behavior.
D2. Information Atom: The minimal content unit that provides a complete, self-contained answer to one user information need.
D3. Modular Content: Content composed entirely of information atoms, each mapped to a specific user information need.
D4. User Effect: A rhetorical pattern that obfuscates, distorts, or substitutes for direct answers to user information needs.
D5. Information Debris: Content elements that do not fulfill any user information need and exist as a byproduct of production conventions or editorial habits.
D6. Answer Intermediary Systems (AIS): Automated generative platforms that retrieve, synthesize, and transform content produced by third parties into direct responses to user queries, functioning as the primary interface between information seekers and source material.
D7. The Intermediary Barrier: The point at which answers delivered by automated intermediary systems satisfy user intent before source journalism is discovered or consulted.
D8. Content Liquidity: The capacity of information to flow across formats, contexts, and delivery mechanisms while maintaining semantic integrity and provenance.
Axioms
A1. AI Structural Dependency: Probabilistic text generators require deterministic structure to produce reliable outputs.
A2. Attribution Erosion: When information flows through intermediary systems without preserved provenance metadata, attribution to source degrades in proportion to the number of transformations.
Scope and limitations
This argument addresses text-based journalism serving user information needs. It does not claim that modular formats suit all journalistic purposes.
Exceptions: Narratives that build suspense through revelation, opinion essays, propaganda, or cultural criticism operate under different engagement dynamics where answer directness is often not the primary value. Is this enough to guarantee that some cockroach variety of journalism will survive the apocalypse and will not be taken over by Answer Intermediary Systems? This may be a topic for further research.
Relevance: The volume of journalism serving information needs is substantial enough that failure to address the Intermediary Barrier in this domain threatens the economic viability of journalism institutions broadly (Scholium VI).
Propositions
P1. For goal-directed information seeking, engagement correlates with answer directness.
Empirical Basis: User research and analytics presented during the Modular Journalism collaborative and later research demonstrate that users engaged in active information seeking prefer content that directly answers their question over content that requires extraction or interpretation. This holds specifically for information needs, not for all journalism consumption modes, although, arguably, the exceptions are few. [1]
P2. Modular content maximizes engagement potential for information-seeking users.
Proof: Information atoms provide complete answers to discrete user questions (by definition), and modular content is composed entirely of such atoms. When engagement correlates with directness of answers (P1), then content made entirely of direct answers maximizes engagement potential for information-seeking users.
P3. Non-modular content reduces discoverability in intent-matching systems.
Proof: Engagement correlates with answer directness (P1), and modular content maximizes engagement precisely because every atom maps to a user information need (P2). Content lacking a clear mapping to user information needs therefore falls short of that maximum. Systems designed to match user intent to content will rank such content poorly, or bypass it, wherever the mapping between user questions and textual answers is unclear or absent.
P4. Contemporary journalism production systematically fails to fulfill user information needs.
Empirical Demonstration: Analysis of mainstream journalism reveals pervasive rhetorical patterns that obscure answers (User Effects, D4) — such as burying key facts in later paragraphs or mixing opinion with reporting — alongside content elements serving no user information purpose (Information Debris, D5). [2]
P5. Manual transformation of existing content to modular format is prohibitively costly at scale.
Proof: Since journalism artifacts often contain rhetorical obfuscation (P4), removing it requires identifying rhetorical patterns, performing semantic analysis, and rewriting for clarity. Development of Agent 2 shows this requires editorial expertise applied paragraph by paragraph, making the labor hours prohibitive at scale.
P6. Agentic transformation of content to modular format is technically feasible but structurally dependent.
Proof: Because AI systems require deterministic structure to produce reliable outputs (A1), agents can transform content only when provided with explicit structural scaffolding: taxonomies defining user questions, rhetorical problems, and validation rules. The agentic pipeline demonstrates feasibility only under exactly these structural conditions, and it remains a work in progress. [3]
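To make "explicit structural scaffolding" concrete, here is a minimal Python sketch. It is not the pipeline's actual code: the taxonomy entries, the flag names, and the `validate_atom` function are hypothetical illustrations. The point is only that an agent's output is accepted when it fits a closed, predeclared structure and rejected otherwise.

```python
# Hypothetical scaffolding: a closed taxonomy of user questions and
# rhetorical problems. These labels are illustrative assumptions, not
# the taxonomy used in the Modular Journalism work.
USER_QUESTIONS = {"what_happened", "why_does_it_matter", "what_happens_next"}
RHETORICAL_PROBLEMS = {"buried_fact", "opinion_mixed_with_reporting"}


def validate_atom(atom: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the atom passes."""
    errors = []
    if atom.get("user_question") not in USER_QUESTIONS:
        errors.append("user_question is not in the taxonomy")
    unknown = set(atom.get("flags", [])) - RHETORICAL_PROBLEMS
    if unknown:
        errors.append(f"unknown rhetorical flags: {sorted(unknown)}")
    if not atom.get("text", "").strip():
        errors.append("text is empty")
    return errors


# One output that fits the scaffolding, and one that violates every rule.
accepted = {"user_question": "what_happened", "flags": [],
            "text": "The council approved the budget."}
rejected = {"user_question": "vibes", "flags": ["mystery"], "text": ""}

print(validate_atom(accepted))
print(validate_atom(rejected))
```

A check of this kind is deterministic even though the agent producing the atoms is not, which is the division of labor the proposition describes.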
P7. The optimal production strategy is modular-first, but faces adoption barriers.
Proof: If modular content maximizes engagement (P2) and transforming existing content is costly (P5), then producing modular content from the outset avoids transformation costs while capturing engagement benefits. However, empirical observations highlight the risk that some journalists may resist modular formats because of constraints on narrative voice (Scholium II).
P8. Human-AI hybrid systems can identify structure in disordered content.
Proof: Agents require explicit structural scaffolding (P6), but humans excel at recognizing implicit patterns and semantic relationships in ambiguous text. Therefore, hybrid systems where humans perform pattern identification and structural annotation, while agents execute rule-based transformations, achieve feasibility where either operating alone would fail.
P9. Answer intermediary systems face structural challenges symmetric to those of transformation agents.
Proof: Both transformation agents and AIS are probabilistic text generators subject to the structural dependency principle (A1). The tasks an AIS must perform to answer a query (extracting relevant portions, filtering noise, synthesizing a response) are identical to the tasks performed by the transformation agents (identifying answerable content, flagging user effects, generating clean outputs). Same input type, same structural dependency, therefore symmetric challenges. [4]
P10. Content lacking structural metadata suffers degraded representation in intermediary-mediated discovery.
Proof: If AIS struggle with unstructured content due to structural dependency (P9), and the Intermediary Barrier occurs when automated answers preempt source journalism (D7), then unstructured journalism will either be synthesized without proper attribution or bypassed entirely in favor of sources that provide clearer signal. Both outcomes result in degraded or absent representation.
P11. Paragraph-level structural schemas are necessary for provenance preservation.
Proof: If content representation degrades in intermediary systems absent structural metadata (P10), and attribution erodes through transformation without provenance data (A2), then preserving attribution requires machine-readable metadata at the granularity where individual claims (atoms) are made. Since journalistic claims are typically expressed in single paragraphs, the schema must operate at the paragraph or atom level. Article-level metadata is insufficient because AIS synthesize portions, not wholes.
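The difference between article-level and paragraph-level metadata can be sketched in a few lines of Python. Everything here is an illustrative assumption: the `Atom` fields, the `example.org` URLs, and the label-matching `synthesize` function, which stands in for whatever an answer intermediary actually does. The sketch shows only that when each paragraph carries its own provenance, a synthesis built from portions can still name its sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    text: str        # one paragraph answering one user need
    user_need: str   # hypothetical label for the question answered
    publisher: str   # provenance travels with the paragraph
    url: str


def synthesize(atoms: list[Atom], user_need: str) -> dict:
    """Toy stand-in for an AIS: select atoms matching a need, keep their provenance."""
    selected = [a for a in atoms if a.user_need == user_need]
    return {
        "answer": " ".join(a.text for a in selected),
        "sources": sorted({(a.publisher, a.url) for a in selected}),
    }


atoms = [
    Atom("The bridge closed on Monday.", "what_happened",
         "Example Daily", "https://example.org/a#p1"),
    Atom("Repairs will take six weeks.", "what_happens_next",
         "Example Daily", "https://example.org/a#p2"),
    Atom("Inspectors found corroded cables.", "what_happened",
         "Example Wire", "https://example.org/b#p3"),
]

result = synthesize(atoms, "what_happened")
print(result["answer"])
print(result["sources"])
```

The answer combines paragraphs from two publishers, and both remain attributable because the provenance was attached to each paragraph rather than to the articles they came from.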
[1]: The collaborative work described in Modular Journalism: an algebra for news modules involved four news organizations (Deutsche Welle, Maharat Foundation, Clwster, and Il Sole 24 Ore) testing modular formats against traditional long-form articles. User research showed a strong preference for direct answers in information needs.
[2]: Modular Journalism: an algebra for news modules provides the example where an interview subject's loaded claims appear without immediate counterpoint or context about scientific consensus. Mainstream sources use "formulations about sources" — phrases like "the rumors say," or "it is whispered that" — that constitute Information Debris (D5).
[3]: Modular Journalism 2.0: bias in structured news sets the goal of a five-agent pipeline to test feasibility under conditions where structural scaffolding (taxonomies, frameworks, validation rules) is explicit. Hallucination rates spike when any structural element is removed.
[4]: Consider examples discussed in There is no such thing as AI magic: when searching "What did Glenn Gould bring to Beethoven's sonatas?", the AIS must scan text, identify which sentences address the question, filter rhetorical flourishes, and synthesize a response. This is structurally identical to the extraction and cleaning tasks performed by the transformation agents in Modular Journalism 2.0: bias in structured news.
Boundary conditions and real-world constraints
The sufficiency of the Theorem (T1) is conditional on external factors necessary for the model's success. These conditions lie largely outside publisher control, representing the strategic uncertainties journalism institutions must navigate.
BC1. Platform Alignment and Compliance: Answer intermediary platforms must either adopt or be compelled to respect the structural metadata standards (e.g., schema.news). This relies on:
- Regulatory Pressure: Policy forcing attribution (e.g., EU AI Act, ...).
- Quality Competition: Market incentives rewarding accuracy over speed, making structured sources preferable.
- Caveat: Platform monopoly may eliminate competitive pressure, allowing them to extract value without compensation (Scholium V).
BC2. Economic Viability of Dataset Provision: The shift in product from "article" to "liquid information atoms" (C3) must be supported by an economic model that generates sustainable revenue to fund continued journalism production.
BC3. Market Share of Intermediary Systems: The argument holds if the share of information discovery mediated by intermediary systems remains significant or continues to increase relative to other distribution channels.
Corollaries
C1. Breaking the Intermediary Barrier requires modular production.
From P10: Structured content receives preferential representation in AIS. Modular journalism with explicit structural metadata is therefore necessary for maintaining discoverability when information discovery shifts to intermediary-mediated channels.
C2. SEO logic extends to atomic content structure.
From P11: Just as schema.org made HTML readable for traditional search engines, paragraph-level modular schemas make journalism semantically readable for AIS (by marking up which text answers which user questions, with what sourcing and confidence).
C3. Content producers are becoming dataset providers.
From C1, P11, and D8: As delivery shifts to automated answer systems, and those systems require structured inputs (P9), journalism's primary product evolves from reader-facing articles to machine-readable information atoms with explicit provenance metadata—essentially structured datasets of verified claims.
C4. The transformation pipeline is bilateral.
From P6 and P9: The structural apparatus that enables agents to clean content for internal modular production (taxonomies, schemas) is the same apparatus that enables answer intermediaries to externally use content reliably (preserving attribution). Investment in this infrastructure serves both goals simultaneously.
Theorem
T1. Given the rise of answer intermediary systems in information discovery, journalism maintains discoverable presence if and only if it adopts modular-first production with paragraph-level structural metadata.
Proof:
Necessity: Content lacking structural metadata suffers degraded representation in intermediary systems (P10). Therefore, maintaining discoverable presence requires structural metadata at the granularity where claims are made (P11). Modular production is what yields atoms at that granularity (D2, D3), so modular production with paragraph-level metadata is necessary.
Conditional Sufficiency: Structured modular content enables both reliable agentic transformation of existing journalism (P6) and preferential surfacing by answer intermediaries through reduced extraction difficulty (P9). With provenance preserved via paragraph-level schema (P11), attribution and discovery mechanisms remain intact as delivery shifts to intermediary platforms. Sufficiency is conditional on the Boundary Conditions (BC1, BC2, BC3).
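Because the argument is laid out as a chain of citations, its skeleton can be checked mechanically. The Python sketch below encodes only the dependencies that the proofs in this essay cite explicitly. That mapping is one reading of the text, so treat it as an assumption. It uses the standard-library `graphlib` module to confirm that the chain is not circular. `DEPENDS_ON` and `premises` are illustrative names.

```python
from graphlib import TopologicalSorter

# Each statement maps to the statements its proof cites in the text.
DEPENDS_ON = {
    "P2": {"P1", "D2", "D3"},
    "P3": {"P2"},
    "P5": {"P4"},
    "P6": {"A1"},
    "P7": {"P2", "P5"},
    "P8": {"P6"},
    "P9": {"A1"},
    "P10": {"P9", "D7"},
    "P11": {"P10", "A2"},
    "T1": {"P6", "P9", "P10", "P11", "BC1", "BC2", "BC3"},
}


def premises(statement: str) -> set[str]:
    """Collect every statement that `statement` transitively relies on."""
    seen: set[str] = set()
    stack = [statement]
    while stack:
        for dep in DEPENDS_ON.get(stack.pop(), set()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# static_order() raises graphlib.CycleError if the argument is circular.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
t1_premises = premises("T1")

print(order)
print(sorted(t1_premises))
```

Read this way, T1 rests on A1, A2, D7, P6, P9, P10, P11, and the three boundary conditions. A different reading of the citations would give a different table.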
Scholia
Scholium I (On the Symmetry): The realization that answer intermediary systems face the same "junk in, junk out" problem as transformation agents (P9) reveals that structure is not merely a production optimization—it is a survival strategy. The symmetry emerges directly from Axiom A1: all probabilistic text generators require deterministic structure for reliable operation.
Scholium II (On Journalist Resistance): P7 notes adoption barriers stemming from cultural attachment to narrative formats. However, C3 suggests this resistance creates an existential risk: journalists who refuse modular production face complete disintermediation. The question becomes whether journalism as an institution survives without adopting the necessary structure.
Scholium III (On the Endgame): C3's conclusion—that content producers become dataset providers—inverts the traditional publishing model. The "article" as atomic unit disappears; the "front page" as editorial curation disappears. What remains is an API of liquid information atoms, with provenance metadata, consumed by personalization engines and answer intermediary systems. This is the logical terminus of observable trajectories.
Scholium IV (On the Standard): A modular content standard must emerge for intermediary-mediated discovery — a schema equivalent. This standard must include:
- Paragraph-level semantic tagging (Which user need the atom answers).
- Rhetorical flags (Presence of User Effects, D4).
- Entity relationships, sourcing, and links to primary sources.
- Temporal validity and update provenance.
The coordination problem is substantial, as individual publishers cannot break the Intermediary Barrier alone, yet they cannot afford to wait for consensus. This tension dictates the future of publisher agency.
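As a thought experiment, the requirements listed in this scholium could be carried by a record like the one below. This is a hypothetical Python sketch, not a proposed standard: every field name, the example values, and the `to_json` helper are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AtomRecord:
    text: str
    user_need: str  # paragraph-level semantic tag
    rhetorical_flags: list[str] = field(default_factory=list)  # User Effects (D4), if any
    entities: list[str] = field(default_factory=list)          # entities the atom mentions
    primary_sources: list[str] = field(default_factory=list)   # links to primary sources
    valid_from: str = ""   # temporal validity, ISO 8601 date
    valid_until: str = ""  # empty means no known expiry
    updated: str = ""      # update provenance, ISO 8601 date

    def to_json(self) -> str:
        """Serialize the record so it can travel with the paragraph."""
        return json.dumps(asdict(self), sort_keys=True)


record = AtomRecord(
    text="The council approved the 2025 budget on 3 March.",
    user_need="what_happened",
    entities=["City Council"],
    primary_sources=["https://example.org/minutes/2025-03-03"],
    valid_from="2025-03-03",
    updated="2025-03-04",
)

round_trip = json.loads(record.to_json())
print(round_trip["user_need"], round_trip["rhetorical_flags"])
```

The empty `rhetorical_flags` list is itself information: it states that the atom was checked for User Effects and none were found, which an absent field could not express.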
Scholium V (On Platform Incentives): The assumption that AIS prefer structured content (P10) rests on platforms valuing reliable outputs. If platforms instead prioritize keeping users within a walled garden, attribution becomes a liability for them, even if synthesis errors increase. This is why BC1 (Platform Alignment and Compliance) is the most critical external dependency.
Infinite thanks to Sannuta Raghu and David Caswell for their inspiring work, insight and patience. I am comforted we will face the apocalypse together. 🪳
