Explanations
location relative to objects
Negative Logits
destroys        -1.09
=(              -1.01
ofition         -0.98
/=              -0.98
xiv             -0.93
耵              -0.93
steers          -0.91
花の            -0.91
鹋              -0.91
↵↵↵↵↵↵↵↵↵↵↵     -0.91
Positive Logits
overhead        1.22
što             1.01
comprehensive   0.97
akal            0.92
from            0.92
ginge           0.91
matang          0.90
height          0.90
full            0.88
никами          0.87
Activations Density 0.013%