INDEX
Explanations
instances of the word "to," particularly in contexts indicating a return or recommitment
Negative Logits
ütün      -0.16
enties    -0.16
ibur      -0.15
loff      -0.15
anine     -0.14
zim       -0.14
hend      -0.14
outcome   -0.13
zw        -0.13
lleg      -0.13
Positive Logits
basics    0.24
normal    0.23
sender    0.22
fold      0.20
roots     0.19
Sender    0.19
roots     0.19
normal    0.19
earth     0.19
正常      0.18
Activations Density 0.071%