Explanations
Non-English text
The neuron activates on German content words, mostly nouns (often capitalized), used in urban-planning and social-control contexts.
Negative Logits
fluctuations        -0.06
KING                -0.06
Max                 -0.06
as                  -0.06
ions                -0.06
organis             -0.06
.Mvc                -0.06
orado               -0.06
Rich                -0.06
ExecutionContext    -0.06
Positive Logits
:/                  0.07
━━━━━━━━━━━━━━━━    0.07
(other              0.07
_generated          0.06
soci                0.06
приблиз             0.06
ch                  0.06
                    0.06
pseud               0.06
seud                0.06
Activations Density 0.067%