Explanations

completeness, automatic, relevant, convex, friendly

This neuron detects standalone classification or alert labels indicating formal status or severity (e.g. “Near Threatened,” “Excessive,” “Unhealthy,” “Slight,” “classé”).

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

"rolet"      0.75
"arbeiten"   0.66
"ammlung"    0.64
"apro"       0.64
"openai"     0.64
"を中心"      0.63
"ineri"      0.63
" онлайн"    0.61
"inser"      0.61
"वर्"         0.61

Positive Logits

"ness"         0.78
" status"      0.69
" poziomie"    0.64
" Status"      0.61
"ity"          0.60
" STATUS"      0.59
"หรือไม่"        0.59
" threshold"   0.58
" artinya"     0.58
" Threshold"   0.58

Activations Density: 0.426%
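
To put the density figure in context, here is a minimal Python sketch. It assumes that "activation density" means the fraction of token positions where the neuron's activation is strictly positive; the page does not define the term, so treat that as an assumption. The function name `activation_density` and the variable names are hypothetical. The prompt count, prompt length, and density are taken from the configuration and density figures on this page.

```python
# Assumption: activation density = fraction of token positions whose
# activation is strictly positive. The dashboard does not state this.
def activation_density(activations):
    """Return the share of strictly positive entries in a list of rows."""
    total = 0
    active = 0
    for row in activations:      # one row per prompt
        for value in row:        # one value per token position
            total += 1
            if value > 0:
                active += 1
    return active / total if total else 0.0


# Figures reported on the dashboard.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
DENSITY = 0.00426  # 0.426%

# Total token positions scanned, and the rough number of positions on
# which the neuron would fire under the assumption above.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY)
```

Under that assumption, the neuron would fire on roughly 430 thousand of about 100 million token positions, which fits a sparse detector that only responds to specific label-like tokens.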