Explanations
phrases indicating a cessation or halt in activity or progress
Negative Logits
Token          Logit
adoo           -0.18
theid          -0.18
urma           -0.17
CHIP           -0.15
adera          -0.15
iox            -0.15
raz            -0.15
ContextHolder  -0.14
ongo           -0.14
ocz            -0.14
Positive Logits
Token          Logit
halt           0.29
abrupt         0.26
sc             0.25
grinding       0.24
stand          0.22
crashing       0.22
ing            0.20
merc           0.20
merc           0.19
end            0.18
Activations Density: 0.020%