INDEX
Explanations
Word fragments and language
The neuron responds to terms and word fragments that denote political or ideological concepts (e.g. “ideology,” “neo-cons,” “post-truth”).
Negative Logits
yyyy       -0.07
molest     -0.07
uesto      -0.06
synonym    -0.06
віднос     -0.06
salute     -0.06
."_        -0.06
uplat      -0.06
責         -0.06
££         -0.06
Positive Logits
*****↵     0.06
누구       0.06
MESSAGE    0.06
REFERRED   0.06
_CHAR      0.06
Further    0.06
.INVALID   0.06
FString    0.06
-rays      0.06
Basically  0.06
Activations Density 0.033%