Explanations
phrases indicating actions or feelings directed at others
toward and similar prepositions
Negative Logits
Token       Logit
iastes      -0.30
Entsche     -0.30
especias    -0.30
uș          -0.28
([^         -0.28
laim        -0.28
Boolean     -0.28
lös         -0.27
Eſ          -0.27
canst       -0.27
Positive Logits
Token       Logit
envers      0.92
égard       0.84
AndEndTag   0.75
gegenüber   0.70
confronti   0.69
noDo        0.66
égard       0.64
Toward      0.62
vů          0.62
toward      0.62
Activations Density: 0.033%