Explanations
prepositions and conjunctions
Negative Logits
(     1.54
(     1.05
((    0.99
/     0.93
هم    0.91
(     0.91
(_,   0.91
ها    0.90
(,    0.89
ла    0.86
Positive Logits
along        1.49
which        1.45
alongside    1.35
oraz         1.34
throughout   1.26
which        1.24
spearheaded  1.22
during       1.21
która        1.21
amidst       1.21
Activations Density 0.454%