Explanations
words denoting exclusivity or limitation
Negative Logits
giuri
-0.78
ansatte
-0.67
publiques
-0.66
varandra
-0.64
MessageBoxIcon
-0.63
avviene
-0.61
giornal
-0.59
sanitaires
-0.59
medlems
-0.58
клопе
-0.58
Positive Logits
头
0.72
########.
0.67
Offensive
0.63
Anterior
0.63
ographiques
0.62
頭
0.61
offensi
0.61
*{
0.60
anterior
0.60
Incoming
0.59
Activations Density 0.146%