INDEX
Explanations
highly polarized and emotional statements
phrases with repeated symbols or unusual characters
Negative Logits
Token        Logit
Panc         -0.68
entary       -0.66
airs         -0.65
mares        -0.65
pageant      -0.63
Doodle       -0.62
iewicz       -0.62
Alive        -0.62
Ascension    -0.62
Belg         -0.62
Positive Logits
Token        Logit
ª            1.31
ł            1.12
IJ            1.12
¹            1.11
«            1.10
ij            1.09
¤            1.09
¡            1.06
¦            1.03
®            1.00
Activations Density: 0.145%
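
As a rough illustration of what a density figure like this usually means, the sketch below assumes that density is the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical; the page itself does not state how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all. The source page does not define it.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 20,000 tokens, of which 29 activate the feature.
sample = [0.0] * 19971 + [0.5] * 29
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.145% means the feature fires on roughly 1 or 2 of every 1,000 sampled tokens, so it is a sparse feature.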