INDEX
Explanations
This neuron never activates—it doesn’t respond to any tokens.
Negative Logits
Templates   -0.07
reorder     -0.07
lesen       -0.07
stva        -0.07
itudes      -0.06
Collapse    -0.06
ptune       -0.06
-summary    -0.06
POCH        -0.06
reserve     -0.06
Positive Logits
Gab         0.06
_flags      0.06
↵ ↵         0.06
سات         0.06
↵ ↵         0.05
↵           0.05
hai         0.05
↵           0.05
Vib         0.05
Flem        0.05
Activations Density 0.092%