Negative Logits
which      -1.80
which      -1.41
которая    -0.97
Which      -0.93
Which      -0.90
WHICH      -0.87
которое    -0.83
который    -0.82
которые    -0.80
والتي      -0.79
Positive Logits
purpoſe    1.16
Jefus      1.11
Efq        1.11
ſeveral    1.09
doubtnut   1.08
iſt        1.05
ſind       1.05
Monfieur   1.05
ſelf       1.04
myſelf     1.04
Activations Density: 0.089%