INDEX
Explanations
Phrases and concepts related to direction and movement, particularly in the context of making progress or changes.
Negative Logits
auce     -0.16
loth     -0.15
yš       -0.14
λλη      -0.13
ortal    -0.13
åłĤ      -0.13
agle     -0.13
fait     -0.13
ORMAL    -0.13
Dah      -0.12
Positive Logits
direction     1.43
directions    1.23
direction     1.20
Direction     1.17
æĸ¹åIJij        1.09
Direction     1.08
Directions    1.03
-direction    1.00
_direction    0.94
Directions    0.92
Activations Density 0.351%