Explanations
This neuron detects the second-person pronoun “you,” i.e. direct address to the user.
Negative Logits
Token        Logit
Scots        -0.07
base         -0.07
Mos          -0.06
blog         -0.06
speech       -0.06
support      -0.06
Sarah        -0.06
поход        -0.06
Ν            -0.06
electronic   -0.06
Positive Logits
Token        Logit
attrib       0.07
еви          0.07
แล           0.06
.AllowUser   0.06
='.          0.06
\Action      0.06
려           0.06
학년도       0.06
"),"         0.06
ファ         0.06
Activations Density: 0.017%
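
A density figure like this is usually read as the share of token positions on which the neuron fires at all. The page does not define the metric, so the sketch below assumes "fires" means an activation strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density counts positions with activation > threshold;
    the dashboard itself does not state its exact definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 token positions, 2 of which activate.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% would correspond to roughly 17 activating tokens per 100,000.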