Explanations
This neuron activates on numeric tokens, i.e. sequences of digits.
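As a rough illustration of what this explanation predicts, the sketch below flags tokens that consist only of digits. The helper name `is_numeric_token` and the sample tokens are hypothetical, and the choice to ignore surrounding whitespace and to accept only ASCII digits is an assumption, not something the explanation states.

```python
import re

# Assumption: a "numeric token" is one or more ASCII digits, with any
# leading/trailing whitespace (often attached by tokenizers) ignored.
_DIGITS = re.compile(r"[0-9]+")


def is_numeric_token(token: str) -> bool:
    """Return True if the token is a pure sequence of digits."""
    return _DIGITS.fullmatch(token.strip()) is not None


# Hypothetical sample tokens, chosen only for illustration.
sample = ["2023", " 42", "weekend", "3.14", ""]
flags = [is_numeric_token(t) for t in sample]
```

Under this definition, "3.14" is not counted as numeric because of the decimal point; a looser pattern would be needed if such tokens should also match.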
Negative Logits
Thomas      -0.06
Všech       -0.06
counter     -0.06
网          -0.06
Anal        -0.06
weekend     -0.06
full        -0.06
fields      -0.06
сий         -0.06
ーの        -0.06
Positive Logits
ecute       0.07
fearing     0.06
-lined      0.06
✔           0.06
приход      0.06
comet       0.06
certified   0.06
???         0.06
Trainer     0.06
nek         0.06
Activations Density: 0.003%
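To make the density figure concrete, here is a minimal sketch of how such a number could be computed, assuming that density means the fraction of sampled tokens on which the neuron's activation exceeds a threshold. That definition, the function name `activation_density`, the zero threshold, and the synthetic activations are all assumptions for illustration; the page does not say how its value was obtained.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not from the page): 3 active tokens out of 100,000.
acts = [0.0] * 99_997 + [0.5, 1.2, 0.1]
density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.003%"
```

A density this low means the neuron is silent on almost every token, which is consistent with a feature that responds to a narrow class of inputs.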