Explanations
instances of stopping or pausing actions within a narrative
Negative Logits
ikh       -0.17
kh        -0.17
crown     -0.15
chl       -0.15
errat     -0.15
zzo       -0.14
Crown     -0.14
_TRACE    -0.14
encing    -0.14
moder     -0.14
Positive Logits
-stop     0.16
971       0.15
ftar      0.15
690       0.15
पड़        0.15
çek       0.15
เท        0.15
933       0.14
STOP      0.14
idar      0.14
Activations Density 0.157%