Explanations
negative statements and political figures
Negative Logits
iple              -0.78
utor              -0.68
DragonMagazine    -0.67
ãĥĥãĥĪ            -0.67
Intake            -0.66
ILCS              -0.65
SERV              -0.64
ISE               -0.62
prest             -0.60
guiActiveUn       -0.59
Positive Logits
noting            1.35
saying            1.29
stating           1.28
citing            1.13
insisting         1.07
mentioning        1.00
arguing           0.98
declaring         0.98
claiming          0.98
stressing         0.96
Activations Density 0.419%
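
The negative-logit token ãĥĥãĥĪ looks like mojibake but is most likely a byte-level BPE token shown in its printable form. The sketch below assumes the dashboard uses the GPT-2 byte-to-unicode mapping (an assumption: the source does not name the tokenizer) and inverts that mapping to recover the underlying UTF-8 text. The function names are illustrative, not taken from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


def decode_token(token):
    """Map a displayed token back to bytes, then decode as UTF-8."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(inverse[ch] for ch in token)
    # Assumption: invalid sequences are replaced rather than raising,
    # since a single token may hold only part of a multi-byte character.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥĥãĥĪ"))  # → ット
```

Under that assumption the token is the katakana pair ット, while plain ASCII tokens such as "noting" decode to themselves.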