Explanations
proper nouns or names
references to political figures and actions
NEGATIVE LOGITS
\">            -0.83
Bang           -0.81
覚醒           -0.78
ORPG           -0.75
maps           -0.73
forestation    -0.73
piracy         -0.73
ュ             -0.73
エル           -0.72
ビ             -0.72
POSITIVE LOGITS
disingen       1.21
dishon         1.07
perjury        1.05
condesc        1.03
dishonest      1.03
rhetorical     1.02
rebuke         1.02
coward         1.02
repud          1.01
hypocrisy      1.00
Activations Density 0.955%