INDEX
Explanations
political and governmental terms and phrases
references to opinions and claims made by individuals or authorities
Negative Logits
ゴン            -0.76
respectively    -0.59
surprisingly    -0.59
anwhile         -0.56
arthed          -0.55
xtap            -0.52
described       -0.51
translation     -0.50
quished         -0.50
arnaev          -0.50
Positive Logits
..."            1.49
…"              1.40
%"              1.35
,"              1.31
)",             1.30
)"              1.29
.")             1.28
),"             1.27
..."            1.23
,'"             1.23
Activations Density 2.005%