INDEX
Explanations
instances of characters "staring" or "looking" at something, indicating attention or observation
Negative Logits
uiltin    -0.16
upil      -0.16
alim      -0.15
ubat      -0.15
ural      -0.15
upert     -0.14
isel      -0.14
幹線      -0.14
ader      -0.14
ーク      -0.14
Positive Logits
toward    0.22
towards   0.20
onto      0.18
Tow       0.17
into      0.16
Towards   0.15
onto      0.15
upon      0.15
hacia     0.15
down      0.15
Activations Density 0.091%