Explanations
prepositions followed by common words
Negative Logits
بما  0.42
zejména  0.37
vielf  0.37
QTTR  0.37
zarówno  0.36
différences  0.36
quels  0.36
După  0.36
የተለያዩ  0.36
শ্  0.36
Positive Logits
his  0.36
a  0.35
its  0.31
Orleans  0.30
Indonesia  0.30
Italy  0.29
agi  0.29
ag  0.29
ana  0.29
Mexico  0.29
Activations Density: 0.086%