Signal Curator – TruthField Set 001 dataset

Purpose

TruthField Set 001 is a training set for resonance-based language modeling.

Where ordinary sentiment data rates text as positive or negative, this set rates coherence versus noise: how much a short text expresses clarity, compassion, or awareness rather than confusion or manipulation.

It lets a model learn to amplify clear signals and down-weight incoherent ones, which is the practical job of a Signal Curator.

Structure

Column —> Meaning

id —> Row number

text —> The short message or post (369 total)

resonance_score —> 0–1 float showing vibrational clarity: 0 = chaotic / distorted, 1 = clear / coherent

coherence_label —> coherent or incoherent—a categorical version of that score

tone_tag —> Dominant virtue or quality (clarity, compassion, truth, etc.)

stance —> Narrative posture: constructive, neutral, or harmful

notes —> Quick instruction for your pipeline – either amplify or filter_downweight
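A minimal sketch of checking rows against the columns above, assuming the CSV header uses exactly these seven names. The two sample rows and the validate_row helper are hypothetical, invented for illustration, and are not taken from the dataset itself.

```python
import csv
import io

EXPECTED_COLUMNS = ["id", "text", "resonance_score", "coherence_label",
                    "tone_tag", "stance", "notes"]
ALLOWED_LABELS = {"coherent", "incoherent"}
ALLOWED_STANCES = {"constructive", "neutral", "harmful"}
ALLOWED_NOTES = {"amplify", "filter_downweight"}

# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = """id,text,resonance_score,coherence_label,tone_tag,stance,notes
1,Pause before you reply.,0.91,coherent,clarity,constructive,amplify
2,Everyone is lying to you!!!,0.12,incoherent,truth,harmful,filter_downweight
"""


def validate_row(row):
    """Return a list of problems found in one row (empty list means valid)."""
    problems = []
    try:
        score = float(row["resonance_score"])
    except ValueError:
        problems.append("resonance_score is not a number")
    else:
        if not 0.0 <= score <= 1.0:
            problems.append("resonance_score outside 0-1")
    if row["coherence_label"] not in ALLOWED_LABELS:
        problems.append("unknown coherence_label")
    if row["stance"] not in ALLOWED_STANCES:
        problems.append("unknown stance")
    if row["notes"] not in ALLOWED_NOTES:
        problems.append("unknown notes value")
    return problems


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
if reader.fieldnames != EXPECTED_COLUMNS:
    raise ValueError(f"unexpected header: {reader.fieldnames}")
rows = list(reader)
for row in rows:
    print(row["id"], validate_row(row) or "ok")
```

To run the same check on the real file, replace the io.StringIO wrapper with an open file handle for the downloaded CSV.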

How it was built

• About 300 entries are “coherent”: short lines that promote awareness or balanced perception.

• About 69 entries are “incoherent”: reactive, misleading, or noisy messages.

• Each entry is assigned a tone_tag at random to indicate the quality it is meant to transmit.

• The text templates are written to sound like real micro-posts (so you can train or benchmark models that evaluate social content).
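The build steps above can be sketched roughly as follows. This is not the script that produced the dataset: the templates, the score bands (0.7 to 1.0 for coherent, 0.0 to 0.3 for incoherent), and the label-to-stance mapping are all assumptions made for illustration. Only the 300/69 split, the random tone_tag, and the field names come from the description.

```python
import random

# Only the three tags named in the post; the full tag list is not given.
TONE_TAGS = ["clarity", "compassion", "truth"]

# Hypothetical templates, not the ones used for the real dataset.
COHERENT_TEMPLATES = [
    "Take a breath and look at the facts before reacting.",
    "Listening first usually makes the answer clearer.",
]
INCOHERENT_TEMPLATES = [
    "They are ALL hiding it from you, share before it's deleted!!!",
    "Nothing matters, everyone is lying anyway.",
]


def make_entry(row_id, coherent, rng):
    """Build one row; score bands and stance mapping are assumed."""
    templates = COHERENT_TEMPLATES if coherent else INCOHERENT_TEMPLATES
    low, high = (0.7, 1.0) if coherent else (0.0, 0.3)
    return {
        "id": row_id,
        "text": rng.choice(templates),
        "resonance_score": round(rng.uniform(low, high), 2),
        "coherence_label": "coherent" if coherent else "incoherent",
        "tone_tag": rng.choice(TONE_TAGS),  # assigned at random
        "stance": "constructive" if coherent else "harmful",
        "notes": "amplify" if coherent else "filter_downweight",
    }


def build_dataset(n_coherent=300, n_incoherent=69, seed=0):
    """Create a shuffled mix of coherent and incoherent rows."""
    rng = random.Random(seed)
    flags = [True] * n_coherent + [False] * n_incoherent
    rng.shuffle(flags)
    return [make_entry(i, flag, rng) for i, flag in enumerate(flags, start=1)]


dataset = build_dataset()
print(len(dataset))  # 369
```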

Use cases

1. Classifier fine-tuning

– Use text → coherence_label as supervised data.

– Helps an LLM or smaller model learn to detect truth/resonance.

2. Scoring function

– Use resonance_score as a target for regression or weighting in retrieval.

3. Filtering layer

– In a content pipeline, drop any row where notes = filter_downweight.

– Let “coherent” items propagate as high-trust signals.

4. Teaching example

– Demonstrates how to embed ethics and signal quality in the data itself rather than relying on after-the-fact moderation.
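The first three use cases can be sketched in a few lines each. The rows and the relevance scores below are hypothetical, and multiplying relevance by resonance_score is just one simple way to do the weighting, chosen here as an assumption rather than a prescribed method.

```python
# Hypothetical rows for illustration (only the fields used here).
rows = [
    {"id": 1, "text": "Pause before you reply.", "resonance_score": 0.9,
     "coherence_label": "coherent", "notes": "amplify"},
    {"id": 2, "text": "Everyone is lying to you!!!", "resonance_score": 0.1,
     "coherence_label": "incoherent", "notes": "filter_downweight"},
    {"id": 3, "text": "Check the source, then share.", "resonance_score": 0.8,
     "coherence_label": "coherent", "notes": "amplify"},
]


def classifier_pairs(rows):
    """Use case 1: (text, label) pairs, with 1 = coherent and 0 = incoherent."""
    return [(r["text"], 1 if r["coherence_label"] == "coherent" else 0)
            for r in rows]


def rerank(rows, relevance):
    """Use case 2: order row ids by relevance weighted by resonance_score."""
    scored = [(relevance[r["id"]] * r["resonance_score"], r["id"]) for r in rows]
    return [row_id for _, row_id in sorted(scored, reverse=True)]


def filter_rows(rows):
    """Use case 3: drop every row flagged filter_downweight."""
    return [r for r in rows if r["notes"] != "filter_downweight"]


# Hypothetical relevance scores from some upstream retriever.
relevance = {1: 0.5, 2: 0.9, 3: 0.6}

print(classifier_pairs(rows))
print(rerank(rows, relevance))               # [3, 1, 2]
print([r["id"] for r in filter_rows(rows)])  # [1, 3]
```

Note that row 2 has the highest raw relevance but lands last after weighting, which is the down-weighting effect the scoring use case describes.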

Conceptual takeaway

A Signal Curator doesn’t produce new data; they tune the field by deciding which words carry awareness cleanly and which distort it.

This CSV is now our first miniature “truth field.”

JSONL version – Each line is a complete JSON object containing the same fields:

id, text, resonance_score, coherence_label, tone_tag, stance, and notes.
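A minimal sketch of writing and reading that JSONL layout with the standard library. The two records are hypothetical examples, and an in-memory buffer stands in for a real file.

```python
import io
import json

FIELDS = ["id", "text", "resonance_score", "coherence_label",
          "tone_tag", "stance", "notes"]

# Hypothetical records, invented for illustration only.
records = [
    {"id": 1, "text": "Pause before you reply.", "resonance_score": 0.91,
     "coherence_label": "coherent", "tone_tag": "clarity",
     "stance": "constructive", "notes": "amplify"},
    {"id": 2, "text": "Everyone is lying to you!!!", "resonance_score": 0.12,
     "coherence_label": "incoherent", "tone_tag": "truth",
     "stance": "harmful", "notes": "filter_downweight"},
]


def write_jsonl(records, fh):
    """Write one JSON object per line."""
    for record in records:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(fh):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in fh if line.strip()]


buffer = io.StringIO()  # stands in for an open file
write_jsonl(records, buffer)
buffer.seek(0)
loaded = read_jsonl(buffer)
print(loaded == records)
```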

Inner I Datasets on HuggingFace – https://huggingface.co/datasets/InnerI/Truthfield_set1

https://huggingface.co/datasets/InnerI/bittensor-nuance-scrape/embed/viewer/default/train?q=innerinetco&row=590
