INDEX
Explanations
This neuron detects occurrences of the word “belt.”
Negative Logits
Token      Logit
Indiana    -0.07
Destroy    -0.07
consumers  -0.07
Christmas  -0.07
اینجا      -0.07
486        -0.07
_indx      -0.07
Pixar      -0.07
iana       -0.06
Cyan       -0.06
Positive Logits
Token      Logit
belt       0.13
Belt       0.12
belts      0.10
belt       0.09
eln        0.08
elt        0.07
負         0.07
带         0.07
llev       0.07
Bolt       0.07
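
Token/logit lists of this kind are commonly obtained by projecting a neuron's output direction through the model's unembedding matrix and sorting the resulting per-token scores. The sketch below illustrates that idea only; it is an assumption about how such lists are typically built, not a description of this page's pipeline. The function name `top_logit_effects`, the four-token vocabulary, and all vectors are hypothetical toy values.

```python
def top_logit_effects(direction, unembed, vocab, k=3):
    """Return the k most positive and k most negative (token, score) pairs.

    Each score is the dot product of the neuron's output direction with
    one token's unembedding vector.
    """
    scores = [
        (tok, sum(d * u for d, u in zip(direction, row)))
        for tok, row in zip(vocab, unembed)
    ]
    ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
    # First k entries are the most positive; the reversed list gives the most negative.
    return ranked[:k], ranked[::-1][:k]


# Hypothetical toy data (not from any real model): 4 tokens, 2 dimensions.
vocab = ["belt", "Belt", "Indiana", "Pixar"]
unembed = [[1.0, 0.0], [0.5, 0.5], [-1.0, 0.0], [0.0, -1.0]]
direction = [0.5, 0.25]

positive, negative = top_logit_effects(direction, unembed, vocab, k=2)
```

With these toy values, `positive` starts with "belt" and `negative` starts with "Indiana", mirroring the shape of the two lists on this page.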
Activations Density: 0.006%
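
An activation density is usually read as the fraction of sampled tokens on which the neuron fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample activations are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 6 firing tokens out of 100,000 sampled tokens.
acts = [0.0] * 99_994 + [1.2, 0.8, 2.5, 0.3, 1.1, 0.9]
density = activation_density(acts)
formatted = f"{density:.3%}"
```

A density this low means the neuron is silent on almost all tokens, which is consistent with a detector for a single word and its close variants.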