Explanations
political statements or criticisms
elements related to political discourse and criticism
Negative Logits
ãĥĥãĥĪ          -0.68
DragonMagazine  -0.66
prest           -0.65
utor            -0.65
Intake          -0.64
iple            -0.63
SERV            -0.62
INT             -0.62
ILCS            -0.61
Purchase        -0.60
Positive Logits
noting      1.34
saying      1.32
stating     1.29
citing      1.12
insisting   1.11
arguing     0.99
claiming    0.99
stressing   0.97
pointing    0.94
remark      0.94
Activations Density 0.587%