Explanations
sentiments and discussions related to political hypocrisy and societal issues
Negative Logits
Token       Logit
#           -0.16
一定        -0.15
ilian       -0.15
ancia       -0.15
uida        -0.15
STRICT      -0.14
amax        -0.14
HOWEVER     -0.14
ederland    -0.14
ancias      -0.14
Positive Logits
Token          Logit
remotely       0.24
actually       0.24
actual         0.23
basic          0.21
actually       0.20
inconvenient   0.20
bothered       0.19
substance      0.19
any            0.19
intelligent    0.18
Activations Density 0.357%