Explanations
phrases indicating a movement or transition toward something
Negative Logits
Token           Logit
enfans          -0.75
שוליים          -0.74
majánló         -0.73
ainfi           -0.70
ußt             -0.70
šal             -0.68
juges           -0.68
SEGUIR          -0.67
argint          -0.66
journalistes    -0.65
Positive Logits
Token           Logit
into            1.55
INTO            1.38
into            1.36
Into            1.33
Into            1.27
INTO            1.04
onto            0.84
nto             0.59
inta            0.59
onto            0.58
Activations Density: 0.153%