INDEX
Explanations
Speaking out/expressing opinions
The neuron activates on tokens in phrases like “spoke out” or “spoken out,” that is, when the text describes someone publicly speaking up or voicing criticism.
Negative Logits
Token       Logit
εφαρ        -0.06
Task        -0.06
_ASS        -0.06
费用        -0.06
_gshared    -0.06
]['         -0.06
snprintf    -0.06
Gro         -0.06
_META       -0.06
_Generic    -0.06
Positive Logits
Token       Logit
outspoken   0.11
.twimg      0.07
λίγ         0.07
.xrLabel    0.07
�           0.07
responsive  0.06
Rotterdam   0.06
u           0.06
批          0.06
Fisher      0.06
Activations Density: 0.006%