INDEX
Explanations
phrases related to personal interactions and conflicts
Negative Logits
ossus       -0.77
cures       -0.75
prest       -0.73
staggered   -0.72
Keynes      -0.72
doomed      -0.71
unbeat      -0.71
unbeaten    -0.71
reinvent    -0.70
marvel      -0.70
Positive Logits
âĢ              1.22
âĢ              1.17
hijab           1.07
pronouns        1.01
ã               0.94
[+              0.93
GamerGate       0.93
harassment      0.91
disrespectful   0.90
offended        0.86
Activations Density: 0.868%