Explanations
prepositions followed by common words
Negative Logits
if              -2.30
after           -1.95
was             -1.86
two             -1.70
is              -1.70
an              -1.61
one             -1.57
because         -1.57
has             -1.51
having          -1.48
Positive Logits
této            1.72
也都            1.67
sont            1.66
妡              1.64
だけではなく    1.62
archiwizowane   1.60
boister         1.59
presumably      1.56
are             1.55
llavero         1.52
Activations Density 0.336%