INDEX
Explanations
phrases indicating location and direction
Negative Logits
Huyện      -0.15
Zuk        -0.14
grily      -0.14
klad       -0.14
fov        -0.13
bler       -0.13
čan        -0.13
.Scheme    -0.13
uw         -0.13
ungan      -0.13
POSITIVE LOGITS
left       0.91
right      0.90
left       0.75
Left       0.73
Right      0.71
right      0.71
左         0.70
LEFT       0.69
RIGHT      0.69
Left       0.68
Activations Density 0.201%
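
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are illustrative assumptions and do not come from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard itself does not state this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up for illustration): 201 active positions out of 100,000.
toy = [1.0] * 201 + [0.0] * (100_000 - 201)
print(f"{activation_density(toy):.3%}")
```

Under that assumption, a density of 0.201% would mean the feature fires on roughly 2 of every 1,000 sampled tokens.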