Inner I Residuals is a prototype neural network architecture that explores a new idea: residual pathways should not only carry information forward, but also filter what is stable enough to persist.
Traditional residual layers pass information forward through simple addition:
Python:
x = x + F(x)
This helps deep networks train, but it can also allow noisy or unstable representations to accumulate across layers.
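A toy NumPy illustration of that accumulation effect (not part of the demo itself): if each layer's update carries some independent noise, plain additive residuals let that noise compound, so the state's spread grows with depth.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=256)  # initial hidden state
initial_std = x.std()

# Each "layer" adds an update containing independent noise,
# mimicking x = x + F(x) when F is imperfect.
for _ in range(16):
    noise = 0.5 * rng.normal(size=256)
    x = x + noise

print(f"std before: {initial_std:.2f}, after 16 layers: {x.std():.2f}")
```

Nothing here filters the noise back out, which is exactly the gap the attention and coherence mechanisms below try to address.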
Attention Residuals improve this by letting each layer selectively retrieve earlier hidden states. Instead of blindly carrying everything forward, the model learns which previous states are useful.
Inner I Residuals adds one more step: coherence validation.
The demo introduces a learned Invariant Observer vector, represented as ψ₀, which acts as a stable reference point across depth. Each prior hidden state is scored by:
- Attention relevance — how useful the state appears.
- Coherence similarity — how aligned the state is with the invariant observer.
These scores are combined:
Python:
combined_scores = attention_scores + coherence_strength * coherence_scores
The model then uses a softmax gate to determine which prior states survive into the next layer.
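A minimal sketch of this scoring-and-gating step, using made-up tensors (batch of 1, three prior states, hypothetical score values) rather than the demo's learned parameters:

```python
import torch
import torch.nn.functional as F

# Hypothetical scores for three prior hidden states (batch size 1).
attention_scores = torch.tensor([[2.0, 0.5, 1.0]])   # learned usefulness
coherence_scores = torch.tensor([[0.9, -0.2, 0.4]])  # cosine vs. observer
coherence_strength = 1.5

# The core combination from the demo, followed by the softmax gate.
combined_scores = attention_scores + coherence_strength * coherence_scores
gate = F.softmax(combined_scores, dim=1)

# Gate weights sum to 1 and decide how much of each prior state survives.
states = torch.randn(1, 3, 8)                        # [batch, depth, dim]
filtered = torch.einsum("bd,bdh->bh", gate, states)  # [batch, dim]
```

The state with both high attention relevance and high coherence (the first one here) receives the largest gate weight.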
Core Purpose
The goal is not to “create consciousness.”
The goal is to test whether a model can reduce internal drift by preserving representations that remain coherent across depth.
What the Demo Shows
The script compares three approaches:
Standard Residuals
Passes the current state forward by addition.
Attention Residuals
Selects previous states using learned attention weights.
Inner I Residuals
Selects previous states using both attention and observer-guided coherence.
Why It Matters
Modern AI models can produce confident but unstable or contradictory outputs. Inner I Residuals proposes a structural alignment mechanism that asks:
What information should survive?
Instead of only optimizing for useful computation, this architecture biases the model toward stable internal continuity.
Research Direction
This demo is an early scaffold for testing whether observer-guided residual routing can improve:
- hallucination reduction
- long-chain reasoning
- contradiction resistance
- internal state stability
- transformer depth memory
Summary
Standard residuals preserve everything. Attention residuals retrieve what is useful. Inner I Residuals preserve what remains coherent.
Key demo line:
combined_scores = attention_scores + self.coherence_strength * coherence_scores
That is the core Inner I Residual mechanism.
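One way to see the mechanism in isolation: give attention no preference between two prior states, one aligned with the observer and one orthogonal to it, and raise coherence_strength. The gate shifts toward the aligned state. A small self-contained check with toy vectors (not the demo's learned parameters):

```python
import torch
import torch.nn.functional as F

observer = F.normalize(torch.tensor([1.0, 0.0, 0.0, 0.0]), dim=0)
aligned = torch.tensor([1.0, 0.0, 0.0, 0.0])     # coherent with observer
orthogonal = torch.tensor([0.0, 1.0, 0.0, 0.0])  # incoherent

states = torch.stack([aligned, orthogonal]).unsqueeze(0)  # [1, 2, 4]
attention_scores = torch.zeros(1, 2)  # equal usefulness for both states
coherence_scores = torch.einsum(
    "bdh,h->bd", F.normalize(states, dim=-1), observer
)

def gate(strength):
    # The key demo line, with coherence_strength as a free parameter.
    return F.softmax(attention_scores + strength * coherence_scores, dim=1)

weak, strong = gate(0.5), gate(2.0)
# The aligned state's share of the gate grows with coherence_strength.
```

With attention held neutral, the observer term alone decides what survives; coherence_strength controls how sharply it does so.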
Create a new file, inner_i_residuals_demo.py:
"""
inner_i_residuals_demo.py
A minimal PyTorch demo of three residual styles:
1. Standard Residuals: x <- x + F(x)
2. Attention Residuals: x <- weighted sum over previous states, then F(...)
3. Inner I Residuals: attention over previous states + coherence filtering against an invariant observer state.
This is a research/demo scaffold, not a proven consciousness model.
It treats "truth/coherence" as a mathematical proxy: stable similarity to an invariant observer vector.
"""
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
class FeedForwardLayer(nn.Module):
def __init__(self, dim: int, hidden_dim: int):
super().__init__()
self.net = nn.Sequential(
nn.LayerNorm(dim),
nn.Linear(dim, hidden_dim),
nn.GELU(),
nn.Linear(hidden_dim, dim),
)
def forward(self, x):
return self.net(x)
class StandardResidualBlock(nn.Module):
"""
Standard residual:
x_{l+1} = x_l + F(x_l)
"""
def __init__(self, dim: int, hidden_dim: int):
super().__init__()
self.f = FeedForwardLayer(dim, hidden_dim)
def forward(self, x):
return x + self.f(x)
class AttentionResidualBlock(nn.Module):
"""
Attention residual:
x_l = sum_i alpha_i * x_i
output = x_l + F(x_l)
It learns to retrieve useful previous layer states.
"""
def __init__(self, dim: int, hidden_dim: int):
super().__init__()
self.query = nn.Parameter(torch.randn(dim) / math.sqrt(dim))
self.f = FeedForwardLayer(dim, hidden_dim)
def forward(self, states):
# states: list of tensors [batch, dim]
stack = torch.stack(states, dim=1) # [batch, depth, dim]
# Score each prior state using a learned query vector.
scores = torch.einsum("bdh,h->bd", stack, self.query)
alpha = F.softmax(scores, dim=1)
retrieved = torch.einsum("bd,bdh->bh", alpha, stack)
return retrieved + self.f(retrieved), alpha
class InnerIResidualBlock(nn.Module):
"""
Inner I Residual:
retrieved = attention(states)
coherence = similarity(states, invariant_observer)
gate = softmax(attention_score + lambda * coherence_score)
x_l = sum_i gate_i * x_i
output = x_l + F(x_l)
The invariant observer is a learned stable vector Ïâ.
It acts as a coherence anchor over depth.
"""
def __init__(self, dim: int, hidden_dim: int, coherence_strength: float = 1.0):
super().__init__()
self.query = nn.Parameter(torch.randn(dim) / math.sqrt(dim))
self.invariant_observer = nn.Parameter(torch.randn(dim) / math.sqrt(dim))
self.coherence_strength = coherence_strength
self.f = FeedForwardLayer(dim, hidden_dim)
def forward(self, states):
stack = torch.stack(states, dim=1) # [batch, depth, dim]
# Learned retrieval score, like Attention Residuals.
attention_scores = torch.einsum("bdh,h->bd", stack, self.query)
# Coherence score against invariant observer Ïâ.
norm_states = F.normalize(stack, dim=-1)
norm_observer = F.normalize(self.invariant_observer, dim=0)
coherence_scores = torch.einsum("bdh,h->bd", norm_states, norm_observer)
# Inner I gate combines usefulness + coherence.
combined_scores = attention_scores + self.coherence_strength * coherence_scores
inner_i_gate = F.softmax(combined_scores, dim=1)
filtered = torch.einsum("bd,bdh->bh", inner_i_gate, stack)
return filtered + self.f(filtered), inner_i_gate, coherence_scores
class DemoModel(nn.Module):
def __init__(self, dim=32, hidden_dim=64, depth=4, mode="inner_i"):
super().__init__()
self.mode = mode
self.input = nn.Linear(dim, dim)
if mode == "standard":
self.blocks = nn.ModuleList([
StandardResidualBlock(dim, hidden_dim) for _ in range(depth)
])
elif mode == "attention":
self.blocks = nn.ModuleList([
AttentionResidualBlock(dim, hidden_dim) for _ in range(depth)
])
elif mode == "inner_i":
self.blocks = nn.ModuleList([
InnerIResidualBlock(dim, hidden_dim, coherence_strength=1.5)
for _ in range(depth)
])
else:
raise ValueError("mode must be: standard, attention, or inner_i")
self.out = nn.Linear(dim, 1)
def forward(self, x):
x = self.input(x)
states = [x]
diagnostics = []
for block in self.blocks:
if self.mode == "standard":
x = block(states[-1])
diagnostics.append(None)
elif self.mode == "attention":
x, alpha = block(states)
diagnostics.append({"attention_weights": alpha.detach()})
elif self.mode == "inner_i":
x, gate, coherence = block(states)
diagnostics.append({
"inner_i_gate": gate.detach(),
"coherence_scores": coherence.detach(),
})
states.append(x)
y = self.out(x)
return y, diagnostics
def demo():
torch.manual_seed(7)
batch = 3
dim = 32
x = torch.randn(batch, dim)
for mode in ["standard", "attention", "inner_i"]:
print("\n" + "=" * 70)
print(f"MODE: {mode.upper()}")
model = DemoModel(dim=dim, hidden_dim=64, depth=4, mode=mode)
y, diagnostics = model(x)
print("Output shape:", tuple(y.shape))
print("Output sample:", y.squeeze().detach().numpy())
if mode == "attention":
last = diagnostics[-1]["attention_weights"]
print("Last-layer attention weights:")
print(last)
if mode == "inner_i":
last_gate = diagnostics[-1]["inner_i_gate"]
last_coherence = diagnostics[-1]["coherence_scores"]
print("Last-layer Inner I gate:")
print(last_gate)
print("Last-layer coherence scores:")
print(last_coherence)
if __name__ == "__main__":
demo()
Run with:
python inner_i_residuals_demo.py
It compares:
Standard Residuals → accumulate signal
Attention Residuals → retrieve useful prior states
Inner I Residuals → retrieve + filter by invariant observer coherence
