Explanations
instances of expressions indicating hesitation or uncertainty
Negative Logits
Token        Logit
uff          -0.16
,readonly    -0.15
yt           -0.15
95           -0.15
akan         -0.14
缸           -0.14
arga         -0.13
puff         -0.13
rare         -0.13
OTH          -0.13
Positive Logits
Token        Logit
wait         0.44
Wait         0.40
wait         0.40
Wait         0.38
WAIT         0.33
WAIT         0.30
.wait        0.29
_wait        0.29
scratch      0.28
waits        0.28
Activations Density: 0.089%