INDEX
Explanations
appropriate/relevant
This neuron primarily recognizes instances of the word “appropriate.”
Negative Logits
RSA        -0.07
-tests     -0.07
Faculty    -0.06
_four      -0.06
遊         -0.06
Indoor     -0.06
list       -0.06
ethe       -0.06
directors  -0.06
.multiply  -0.06
Positive Logits
disillusion   0.07
devastating   0.07
alem          0.07
intimidating  0.06
tears         0.06
otp           0.06
.scrollView   0.06
بم            0.06
sockopt       0.06
druhou        0.06
Activations Density: 0.022%