INDEX
Explanations
questions
This neuron detects words and phrases that introduce questions (e.g. “how,” “can,” “we”) at the start of interrogative sentences.
Negative Logits
trust       -0.06
Phi         -0.06
cottage     -0.06
impact      -0.06
prevent     -0.06
�           -0.06
script      -0.06
функци      -0.06
TYPE        -0.06
icont       -0.06
Positive Logits
老师        0.07
чоловік     0.07
-ranking    0.07
ские        0.07
imo         0.06
енность     0.06
oralType    0.06
tuner       0.06
舉          0.06
(se         0.06
Activations Density 0.052%
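
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the neuron's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; they do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which activate the neuron.
sample = [0.0] * 9995 + [1.3, 0.4, 2.1, 0.7, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.052% would mean the neuron fires on roughly 5 of every 10,000 tokens, which fits a neuron that responds only to a narrow set of question-introducing words.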