Explanations
discussions about societal issues and critiques of political correctness
Negative Logits
iais          -0.16
iei           -0.15
EntityState   -0.15
ıi            -0.15
iag           -0.14
егоÑĢ         -0.14
åĢĴ           -0.13
_THROW        -0.13
zdy           -0.13
iae           -0.13
Positive Logits
inson         0.16
apy           0.14
plex          0.14
ermann        0.13
Kelley        0.13
uong          0.13
bac           0.13
Bray          0.13
lest          0.13
spiracy       0.13
Activations Density 0.057%