Negative Logits
even            -1.08
especially      -1.07
not             -1.04
vicin           -1.03
first           -0.98
huwa            -0.97
secretly        -0.96
only            -0.95
配慮            -0.91
how             -0.91
Positive Logits
❯               1.03
trám            0.91
Similar         0.91
—¡              0.88
Auflösung       0.87
Estas           0.86
két             0.85
lý              0.85
Posteriormente  0.85
asily           0.83
Activations Density 0.004%