Explanations
This neuron consistently activates on the two-letter token "ia", which often appears at the end of words (e.g. "Columbia").
Negative Logits
ait          -0.07
Jahres       -0.06
Janet        -0.06
โรงแรม       -0.06
throwError   -0.06
xn           -0.06
beware       -0.06
zk           -0.06
แม           -0.06
Sur          -0.06
Positive Logits
subdivisions   0.08
storage        0.08
λα             0.07
overcrow       0.07
avatars        0.07
ivos           0.07
hose           0.07
λά             0.06
fragments      0.06
ادم            0.06
Activations Density 0.000%