INDEX
Explanations
This neuron activates on capitalized proper nouns (e.g. names of people, brands, and other named entities).
Negative Logits
_Q              -0.07
پیچ             -0.06
.prop           -0.06
Crack           -0.06
ynth            -0.06
ame             -0.06
Triangles       -0.06
recreational    -0.06
affid           -0.06
Needle          -0.06
Positive Logits
firmy           0.07
IFI             0.07
شر              0.07
^[              0.06
,out            0.06
gouver          0.06
身              0.06
dared           0.06
STREET          0.06
assez           0.06
Activations Density: 0.158%
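The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the neuron's activation exceeds zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and chosen only to illustrate how a value such as 0.158% could arise.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation is above `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: the neuron fires on 158 of 100,000 tokens.
toy_activations = [1.0] * 158 + [0.0] * (100_000 - 158)
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.158% would mean the neuron is active on roughly 1 in 630 tokens, which fits a feature tied to a fairly narrow category of text.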