Explanations
This neuron activates on numeric tokens (digits and measurement figures), especially in measurements of physical statistics.
Negative Logits
Token       Logit
Porno       -0.07
'B          -0.07
newer       -0.06
Olympics    -0.06
dissoci     -0.06
attr        -0.06
dann        -0.06
soil        -0.06
scant       -0.06
έ           -0.06
Positive Logits
Token       Logit
.")]↵       0.07
何          0.07
gravel      0.06
Sym         0.06
vik         0.06
هدف         0.06
Stay        0.06
Txt         0.06
ISPs        0.06
originate   0.06
Activations Density: 0.003%