Explanations
The neuron activates strongly on numeric tokens (multi-digit numbers) in the text.
Negative Logits
Token       Logit
Guerr       -0.06
身体        -0.06
лор         -0.06
könnte      -0.06
namely      -0.06
thinking    -0.06
Port        -0.06
“One        -0.06
BOTTOM      -0.06
Blood       -0.06
Positive Logits
Token       Logit
чоловік     0.07
sex         0.06
malign      0.06
izona       0.06
theft       0.06
allocated   0.06
sola        0.06
aren        0.06
]]=         0.06
)<=         0.06
Activations Density: 0.028%
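
The density figure can be read as the share of tokens on which the neuron fires at all. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are hypothetical, and the dashboard's exact definition is not stated here.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron is active; the threshold of 0.0 is an illustrative choice.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 tokens, 28 of which activate the neuron.
sample = [0.0] * 99_972 + [1.5] * 28
density = activation_density(sample)
formatted = f"{density:.3%}"
print(formatted)
```

Under this reading, a density of 0.028% corresponds to roughly 28 active tokens per 100,000, which means the neuron is sparse.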