INDEX
NEGATIVE LOGITS
Token       Logit
�           -0.07
usage       -0.07
bef         -0.07
rival       -0.07
Hasan       -0.06
valid       -0.06
.va         -0.06
very        -0.06
bef         -0.06
worst       -0.06
POSITIVE LOGITS
Token       Logit
through     0.16
Through     0.15
through     0.13
Through     0.13
THROUGH     0.12
thru        0.10
_through    0.09
-through    0.09
durch       0.08
attravers   0.08
Activations Density 0.056%