INDEX
Explanations
common denominators
The neuron fires on numeric tokens (multi-digit numbers) in the text.
Negative Logits
'-'      -0.07
.mon     -0.07
tep      -0.06
INUE     -0.06
puted    -0.06
pués     -0.06
eben     -0.06
Nero     -0.06
_Show    -0.06
خشک      -0.06
Positive Logits
Marlins      0.06
societies    0.06
<div         0.06
senator      0.06
continent    0.06
şehir        0.06
$content     0.06
Latinos      0.06
whites       0.06
/pl          0.06
Activations Density 0.002%
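
The density figure above can be read as the share of sampled tokens on which the neuron activates at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are assumptions made for illustration and are not taken from the dashboard or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens with a
    non-zero activation. The dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.002%
```

Under this reading, a density of 0.002% would mean the neuron is active on roughly 2 of every 100,000 tokens, which fits a sparse feature that responds to a narrow class of tokens.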