Explanations
This neuron never activates—it does not detect any pattern.
Negative Logits
ManyToOne    -0.07
player       -0.07
Zheng        -0.07
pathlib      -0.06
голов        -0.06
venir        -0.06
інки         -0.06
terr         -0.06
-minded      -0.06
Muslims      -0.06
Positive Logits
أق           0.07
atisfied     0.07
training     0.07
neon         0.06
تش           0.06
indy         0.06
[:,          0.06
τικ          0.06
_ad          0.06
rina         0.06
Activations Density 0.005%