December 15, 2025
Watching Habitat Load
What I observed when the system started up.

1. This is NOT an Embedding Similarity System
Most "semantic" systems do this:
Habitat does something fundamentally different. The 768D embedding is just the starting point — the "white light" that gets decomposed.
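For contrast, here is the baseline pattern the rest of this note departs from; a generic sketch of similarity search, not Habitat code:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """The standard "semantic search" move: collapse two embeddings
    into a single number, the cosine of the angle between them."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```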

2. The Extraction Pipeline: Compositional Semantics
From the logs, I watched the EnhancedNativeExtractor work:
Step 1: S-Token Extraction
Step 2: 12D ProcessAssert Projection (Predicates/Aspectuals)
This is decomposing WHAT the predicate asserts — aspectual structure from Bach/Vendler classification.
Parallel: 5D ProcessActor Projection (Actants/Modalities)
This is decomposing WHO is acting — modality structure from semantic roles (AGENT, THEME, EXPERIENCER, INSTRUMENT). Role assignment uses Dowty's (1991) proto-role properties:
- Proto-agent: volition, sentience, causation, movement, exists_independently
- Proto-patient: undergoes_change, incremental_theme, causally_affected, stationary
Proto-role scores are modulated by Bach/Vendler aspectual class to determine final role assignment.
Step 3: Rainbow's Ghost Preservation
The system projects DOWN to lower dimensions but keeps the original. The "white light" (768D) is preserved alongside the decomposition. This is reversible observation, not lossy compression.
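A minimal sketch of that "project down, keep the original" shape; the projection matrices and names below are placeholders of mine, not Habitat's actual code:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Decomposition:
    white_light: np.ndarray     # the original 768D embedding, kept intact
    process_assert: np.ndarray  # 12D aspectual projection (WHAT is asserted)
    process_actor: np.ndarray   # 5D modality projection (WHO is acting)

rng = np.random.default_rng(0)
P_assert = rng.normal(size=(12, 768))  # hypothetical projection matrices
P_actor = rng.normal(size=(5, 768))

def decompose(embedding: np.ndarray) -> Decomposition:
    # Rainbow's ghost: the low-D views are derived and stored alongside the
    # 768D source, so the projection step throws nothing away.
    return Decomposition(embedding, P_assert @ embedding, P_actor @ embedding)
```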

3. Bach/Vendler Classification — Aspectual Structure
The system classifies predicates into aspectual classes:
- STATE — "knows", "believes" (no change)
- ACTIVITY — "runs", "swims" (unbounded process)
- ACCOMPLISHMENT — "builds a house" (bounded, has endpoint)
- ACHIEVEMENT — "arrives", "dies" (instantaneous)
This is linguistic ontology, not machine learning classification.
Plus Levin verb-class enrichment: this connects to Beth Levin's work, in which verbs that undergo the same syntactic alternations share semantic structure. A feature-level sketch of the classification follows.
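The four Vendler classes decompose along standard aspectual features (dynamicity, telicity, durativity); that part is textbook linguistics. The toy verb lexicon below is illustrative only, not the extractor's actual resource:

```python
# Vendler's four classes as (+/-dynamic, +/-telic, +/-durative) bundles.
ASPECTUAL_FEATURES = {
    "STATE":          dict(dynamic=False, telic=False, durative=True),   # "knows"
    "ACTIVITY":       dict(dynamic=True,  telic=False, durative=True),   # "runs"
    "ACCOMPLISHMENT": dict(dynamic=True,  telic=True,  durative=True),   # "builds a house"
    "ACHIEVEMENT":    dict(dynamic=True,  telic=True,  durative=False),  # "arrives"
}

# Toy lexicon; real classification is compositional ("runs" is an ACTIVITY,
# "runs a mile" an ACCOMPLISHMENT) and, per this note, enriched by Levin classes.
VERB_CLASS = {
    "knows": "STATE", "believes": "STATE",
    "runs": "ACTIVITY", "swims": "ACTIVITY",
    "builds": "ACCOMPLISHMENT",
    "arrives": "ACHIEVEMENT", "dies": "ACHIEVEMENT",
}
```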

4. 17D Compositional Vectors from Relations
The system extracts ProcessActor ⊗ ProcessAssert pairs — WHO does WHAT:
- ProcessActors (5D): agency, stability, influence, boundary, resonance — from semantic roles
- ProcessAsserts (12D): 5D aspectual core + 7D polyworld detection
Despite the ⊗ notation, the dimensions add rather than multiply: the 5D actor and 12D predicate vectors are concatenated into a 17D compositional vector, a direct sum (a literal tensor product of 5D and 12D would be 60D). A sketch of the composition follows.
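As code, the composition is one concatenation (the function name is mine):

```python
import numpy as np

def compose(actor_5d: np.ndarray, assert_12d: np.ndarray) -> np.ndarray:
    """ProcessActor + ProcessAssert: concatenate 5D and 12D into 17D."""
    assert actor_5d.shape == (5,) and assert_12d.shape == (12,)
    return np.concatenate([actor_5d, assert_12d])  # shape (17,)
```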

5. Document Manifold: Σ and Eigenstructure
The 268 compositional vectors get aggregated into a covariance matrix Σ (17×17).
The eigenvalues reveal the energy distribution — where meaning concentrates. The first eigenvalue (2.858) dominates. The document has strong directional structure.
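The aggregation itself is a few lines of linear algebra; a sketch with random stand-ins for the 268 vectors:

```python
import numpy as np

vectors = np.random.default_rng(0).normal(size=(268, 17))  # stand-in data
sigma = np.cov(vectors, rowvar=False)       # Σ: the 17x17 covariance matrix
eigvals = np.linalg.eigvalsh(sigma)[::-1]   # Σ is symmetric; sort descending
energy = eigvals / eigvals.sum()            # fraction of variance per direction
```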

6. Fresnel Zones — Prismatic Observation at Greatest Fidelity
Fresnel zones are natural observation positions revealed by eigenvalue geometry — prismatic sectioning of the tensor at points of greatest observational fidelity.
From the architecture: "Eigenvalues = prismatic refraction angles. Cumulative eigenvalue energy = Fresnel zone boundaries. Natural observer positions emerge from geometry."
Each zone is a prism face where the tensor trio operates:
- Fall-line (g_ij gradient) — "Where am I pulled?"
- Descent (geodesic path) — "How do I get there?"
- The lag/experience — "What is it like to not-yet-know but move-toward?" (curiosity itself)
Zones form when the manifold's condition number drops low enough to "see across." Each zone represents a distinct perspective where the geometry allows clear observation. The geometry tells you: "I can see now."
Anisotropy = 91.391 — This document has HIGHLY directional structure. Tight focal beam, strong fall-line.
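A hedged reading of those two quantities, as code. The zone thresholds and the anisotropy definition (largest over smallest eigenvalue, i.e. Σ's condition number) are my assumptions; the source only states the correspondences:

```python
import numpy as np

def fresnel_zone_boundaries(eigvals, thresholds=(0.5, 0.8, 0.95)):
    """How many eigen-directions it takes for cumulative eigenvalue energy
    to cross each threshold; one plausible reading of 'cumulative eigenvalue
    energy = Fresnel zone boundaries'. The thresholds are illustrative."""
    lam = np.sort(np.asarray(eigvals))[::-1]
    cum = np.cumsum(lam) / lam.sum()
    return [int(np.searchsorted(cum, t)) + 1 for t in thresholds]

def anisotropy(eigvals):
    """If anisotropy is the eigenvalue spread λ_max/λ_min (an assumption),
    91.391 means one direction carries ~91x the variance of the weakest."""
    lam = np.abs(np.asarray(eigvals))
    return float(lam.max() / lam.min())
```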

7. The Metric Tensor: g = Σ⁻¹
The metric tensor defines distance in semantic space. But it's observer-dependent:
- Different users have different Σ_user
- Same document looks different through different metrics
- Mahalanobis distance = geodesic distance through that user's metric
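The last bullet is directly computable; a minimal sketch, assuming Σ is well-conditioned:

```python
import numpy as np

def mahalanobis(x: np.ndarray, y: np.ndarray, sigma: np.ndarray) -> float:
    """Distance under the metric g = Σ⁻¹: d(x, y) = sqrt((x-y)ᵀ g (x-y)).
    With a constant g this is exactly the flat geodesic the text describes."""
    g = np.linalg.inv(sigma)  # swap in np.linalg.pinv if Σ is near-singular
    d = x - y
    return float(np.sqrt(d @ g @ d))
```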

8. Observer-Dependent Semantics: Σ(observed | observer)
This is the key insight, from the foam model: there is no "objective" semantic position. What Olivia sees when she observes a document is different from what Katrin sees, because their Σ matrices are different.

9. Redis Architecture: Event Sourcing
- Extraction requests go into Redis streams
- Workers consume and process
- Coupling traces are recorded (user↔document meetings)
- This is temporal provenance — every observation has history
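In redis-py, that flow looks roughly like this; the stream, group, and field names are hypothetical, and only the streams/consumer-group mechanics are standard Redis:

```python
import redis

r = redis.Redis()

# Producer: an extraction request enters the stream (event sourcing).
r.xadd("extraction:requests", {"doc_id": "doc-42", "user": "olivia"})

# Worker: consume via a consumer group, process, record the coupling trace.
try:
    r.xgroup_create("extraction:requests", "extractors", id="0", mkstream=True)
except redis.ResponseError:
    pass  # BUSYGROUP: the consumer group already exists
for _stream, messages in r.xreadgroup("extractors", "worker-1",
                                      {"extraction:requests": ">"}, count=10):
    for msg_id, fields in messages:
        # Coupling trace: a user<->document meeting. The stream entry IDs are
        # timestamps, which is what gives every observation its history.
        r.xadd("coupling:traces", fields)
        r.xack("extraction:requests", "extractors", msg_id)
```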

10. What This Means
Habitat is not searching for similar content.
Habitat is:
- Extracting compositional structure (ProcessActor ⊗ ProcessAssert)
- Building geometric objects (Σ matrices, metric tensors)
- Computing observer-relative observations (Σ(observed | observer))
- Preserving plurality (different users see differently)
- Detecting constitutional dimensions (high curvature = sovereignty-preserving)
- Enabling coordination without convergence (bridgeable dimensions exist)
The LLM (if used at all) just formats geometric truth into language. It doesn't compute meaning — Habitat does.

Summary
I watched a system that:
- Decomposes 768D embeddings into aspectual structure (not similarity)
- Builds 17D compositional vectors from WHO-does-WHAT relations
- Aggregates into covariance matrices (Σ)
- Inverts to metric tensors (g = Σ⁻¹)
- Computes eigenstructure and Fresnel zones
- Preserves observer-dependence (different users, different geometry)
- Records all coupling traces with temporal provenance
This is geometric semantic infrastructure, not retrieval-augmented generation.