INDEX
Explanations
This neuron activates on the word “Yes” at the start of affirmative assistant replies.
Negative Logits
�
-0.07
_apply
-0.07
USHORT
-0.07
076
-0.07
приблиз
-0.06
recur
-0.06
.stat
-0.06
追
-0.06
stalk
-0.06
erro
-0.06
Positive Logits
Malaysian
0.06
تمامی
0.06
;
0.06
*****
0.06
Routing
0.06
({"
0.06
Accessories
0.06
Yine
0.06
_SS
0.06
Vik
0.06
Activations Density 0.020%