Negative Logits
Token        Logit
despite      -1.20
for          -1.16
because      -1.13
there        -1.05
not          -1.04
their        -1.03
get          -1.00
or           -0.98
new          -0.96
Taking       -0.96
Positive Logits
Token        Logit
meisten      1.05
sooo         0.94
selben       0.94
BLA          0.88
molti        0.88
鹬           0.87
creș         0.85
ayur         0.85
soooo        0.85
wonderfully  0.85
Activation Density: 0.001%