Explanations
The neuron selectively activates on the word “Get” (as in code comments/instruction lines).
Negative Logits
Token         Logit
sont          -0.07
Honduras      -0.07
_SEC          -0.06
قوان          -0.06
mattresses    -0.06
Doming        -0.06
也            -0.06
продукт       -0.06
Encoding      -0.06
닥            -0.06
Positive Logits
Token         Logit
Getty         0.07
CID           0.07
GK            0.07
Celebrity     0.07
./            0.06
queda         0.06
:'',↵         0.06
ullan         0.06
‘s            0.06
elcome        0.06
Activations Density: 0.029%