INDEX
Explanations
phrases or expressions that indicate contrasting ideas or perspectives
Negative Logits
eview            -0.16
лини             -0.15
.opend           -0.15
磨               -0.15
<KeyValuePair    -0.14
kou              -0.14
/render          -0.14
Sou              -0.14
181              -0.14
Ñij              -0.13
Positive Logits
hand              0.44
hand              0.35
Hand              0.33
_hand             0.28
Hand              0.28
.hand             0.25
contrary          0.24
one               0.24
flip              0.24
HAND              0.23
Activations Density: 0.026%