Negative Logits
Token     Logit
t         1.67
is        1.23
i         1.00
u         0.93
reicht    0.91
ب         0.89
נ         0.89
سين       0.87
the       0.86
ות        0.84
Positive Logits
Token     Logit
Inn       1.26
inns      1.26
inn       1.23
ri        1.05
Inn       1.05
ra        1.02
olls      1.01
inn       1.00
in        0.95
ates      0.93
Activations Density 0.002%