INDEX
Explanations
references to groups or collective pronouns
Negative Logits
GOTREF            -0.75
}}"></            -0.72
(blank)           -0.72
tvguidetime       -0.72
)))));            -0.69
ագրություններ     -0.64
invokingState     -0.63
DoubleQuotes      -0.63
nocześnie         -0.63
smtplib           -0.62
Positive Logits
They              1.03
they              1.01
they              0.97
They              0.88
THEY              0.79
THEY              0.68
เค้า              0.63
Theſe             0.56
akik              0.56
theyre            0.56
Activations Density: 0.139%