INDEX
Explanations
The neuron detects first-person opinion or hedging phrases (e.g. “I think,” “I feel,” “I’m afraid”) that introduce subjective commentary.
Negative Logits
离    -0.07
以后    -0.07
addon    -0.07
720    -0.07
notas    -0.07
沟    -0.07
55    -0.07
seller    -0.06
*/),    -0.06
مثل    -0.06
Positive Logits
↵    0.08
Seems    0.08
think    0.08
[channel    0.07
↵ ↵    0.07
[c    0.07
↵    0.07
시행    0.07
burns    0.07
Rahmen    0.07
Activations Density 0.033%
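
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which the neuron's activation exceeds a threshold. This is an assumption about the metric, not a statement of how this dashboard derives it; the function name `activation_density`, the zero threshold, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires.
    The dashboard's exact definition is not given in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data, made up for illustration: 3 active positions out of 10,000.
sample = [0.0] * 9997 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.033% would mean the neuron fires on roughly 1 in 3,000 token positions, which fits a feature tied to a narrow set of phrases.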