Explanations
Twitter handles and online usernames
Twitter handles and references to specific names or entities
Negative Logits
Þ          -0.99
ñ          -0.92
ò          -0.90
ß          -0.83
pione      -0.83
oun        -0.82
eleph      -0.80
conflic    -0.79
subur      -0.79
awa        -0.78
Positive Logits
t          1.16
tty        1.04
ts         1.03
tt         0.98
td         0.92
tis        0.91
dt         0.91
plet       0.91
tp         0.91
ton        0.90
Activations Density: 0.211%