Explanations
Coding discussions
This neuron responds to phrasing that makes recommendations or suggestions (e.g. modal auxiliaries like “would,” “should,” “could” indicating advice).
Negative Logits
Token          Logit
Exploration    -0.06
complains      -0.06
Pawn           -0.06
대학           -0.06
fraudulent     -0.06
xdc            -0.06
.constraints   -0.05
öğren          -0.05
schw           -0.05
Full           -0.05
Positive Logits
Token          Logit
�n             0.07
ريع            0.07
注册           0.07
ศ              0.06
میل            0.06
servisi        0.06
恢             0.06
Extend         0.06
ó              0.06
꾸             0.06
Activations Density 0.117%
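
The density figure above can be read as the share of token positions on which the neuron fires. The sketch below is one way such a number might be computed, assuming density means "fraction of positions with activation above a threshold"; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from this page or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    neuron is active; the dashboard's exact definition is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 2,564 tokens.
sample = [0.0] * 2561 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the neuron is silent on the vast majority of tokens, which fits a unit described as responding to a narrow kind of phrasing.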