INDEX
Explanations
injustice
The neuron activates on words referring to broad social groups or the general populace (e.g. “people,” “everyone,” “citizen,” “public”).
Negative Logits
Token           Logit
115             -0.07
16              -0.07
BitConverter    -0.07
convers         -0.06
iga             -0.06
swung           -0.06
blog            -0.06
())↵            -0.06
coleg           -0.06
SID             -0.06
Positive Logits
Token           Logit
}$/             0.07
iCloud          0.06
narrowing       0.06
neut            0.06
吉              0.06
/de             0.06
�               0.06
Return          0.06
варто           0.06
kın             0.06
Activations Density: 0.152%
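
For context, an activation density figure is commonly read as the fraction of tokens in a reference corpus on which the neuron fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: a token counts as "active" when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations: 2 of the 8 tokens are active.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 0.152% would correspond to the neuron being active on roughly 15 of every 10,000 tokens.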