Explanations
directions and words related to movement
Negative Logits
Token      Logit
さみ       -0.38
INVENTION  -0.38
oarece     -0.37
CIC        -0.37
leo        -0.37
crapers    -0.36
讼         -0.36
idigung    -0.36
yves       -0.36
aronder    -0.36
Positive Logits
Token      Logit
down       0.63
DOWN       0.55
thither    0.50
Down       0.50
here       0.50
متعلقه     0.50
acá        0.49
للاسماء    0.49
aloft      0.47
собі       0.46
Activations Density: 0.357%