Explanations
Suddenly
explicit content involving non-consensual scenarios.
This neuron fires on the sudden‐event adverb “Suddenly,” marking narrative transitions.
Negative Logits
EXIT          -0.07
.argsort      -0.06
elivery       -0.06
AMS           -0.06
ابقات         -0.06
AUD           -0.06
mination      -0.06
IRD           -0.06
Plans         -0.06
prescribing   -0.06
Positive Logits
Suddenly      0.09
Suddenly      0.09
blob          0.07
hex           0.07
ço            0.07
υτό           0.07
دیگر          0.06
wikipedia     0.06
richness      0.06
↵↵            0.06
Activation Density: 0.007%
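
The page does not state how these figures are computed. As a rough sketch only, assuming the logit lists are the lowest- and highest-scoring vocabulary tokens for this neuron and that density is the share of sampled tokens on which the neuron's activation is above zero, the two quantities could be derived as follows. The function names, the threshold, and the toy data are hypothetical and are not taken from the source.

```python
def top_logits(token_logits, k=10):
    """Return (most_negative, most_positive) lists of (token, logit) pairs.

    token_logits: dict mapping a token string to this neuron's logit effect.
    This is an assumed input format, not the dashboard's actual one.
    """
    ranked = sorted(token_logits.items(), key=lambda item: item[1])
    most_negative = ranked[:k]        # ascending: lowest logits first
    most_positive = ranked[::-1][:k]  # descending: highest logits first
    return most_negative, most_positive


def activation_density(activations, threshold=0.0):
    """Fraction of sampled tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data for illustration; only the four logit values echo the lists above.
toy_logits = {"Suddenly": 0.09, "blob": 0.07, "EXIT": -0.07, "Plans": -0.06}
negative, positive = top_logits(toy_logits, k=2)

# 7 active tokens out of 100,000 sampled gives a density of 0.007%.
toy_activations = [0.0] * 99_993 + [1.5] * 7
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under these assumptions, a density of 0.007% means the neuron is active on roughly 7 of every 100,000 sampled tokens, which fits a feature tied to one specific word.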