INDEX
Explanations
phrases indicating proximity or distance
Negative Logits
Token    Logit
immel    -0.15
halt     -0.14
_WM      -0.14
ÏģÏİ     -0.14
rink     -0.13
amon     -0.13
ķìĿ¸     -0.13
unto     -0.13
ught     -0.13
orses    -0.13
Positive Logits
Token        Logit
reach        0.57
Reach        0.40
Reach        0.39
reach        0.38
range        0.38
sight        0.37
reaches      0.35
reached      0.33
reaching     0.31
reachable    0.31
Activations Density: 0.056%