Explanations
negativity
The neuron fires on negative-sentiment adjectives (e.g. “bad,” “worse,” “evil”).
Negative Logits
adidas      -0.07
consort     -0.07
Clo         -0.06
Schn        -0.06
Thông       -0.06
etta        -0.06
주시        -0.06
errmsg      -0.06
ont         -0.06
GEST        -0.06
Positive Logits
concluded   0.06
de          0.06
.Area       0.06
无码        0.06
instances   0.06
wb          0.06
=out        0.06
_literals   0.06
crollView   0.06
,↵↵↵↵       0.06
Activation Density: 0.019%
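As a rough illustration of what a density figure like this measures, the sketch below computes the percentage of tokens on which a neuron's activation exceeds a threshold. This is an assumption about the definition, not something this page states: the function name `activation_density`, the zero threshold, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source): a token counts as
    "active" when the neuron's activation on it is strictly above threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 tokens, of which 2 activate the neuron.
toy_activations = [0.0] * 9998 + [0.7, 1.3]
density = activation_density(toy_activations)
print(f"Activation density: {density:.3f}%")
```

A density this low would mean the neuron is silent on nearly all tokens, which fits a neuron described as firing only on a narrow class of words.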