INDEX
Explanations
the preposition "from" in various contexts
Negative Logits
Token     Logit
oster     -0.15
ica       -0.15
ault      -0.15
Ñĩе       -0.14
esper     -0.14
gaben     -0.14
_HW       -0.13
antine    -0.13
deaux     -0.13
viron     -0.13
Positive Logits
Token     Logit
simple    0.19
esser     0.16
humble    0.15
to        0.15
endor     0.15
anger     0.14
olume     0.14
ebek      0.14
ENDOR     0.14
simple    0.14
Activations Density: 0.032%
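
To give the density figure a sense of scale, the sketch below converts it into an expected number of activating tokens. It assumes, as is common for dashboards of this kind but not stated on this page, that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation. The function name and the sample size are hypothetical and chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected count of tokens on which the feature fires.

    Assumption: density_percent is the percentage of tokens with a
    nonzero activation (the page itself does not define the term).
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of one million tokens at the reported 0.032% density.
    count = expected_activations(0.032, 1_000_000)
    print(f"about {count:.0f} activating tokens per 1,000,000")
```

Under that assumption, the feature is sparse: it fires on roughly 3 tokens in every 10,000.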