Explanations
phrases related to vigilance or attentiveness
Negative Logits
Token         Logit
ynes          -0.18
Marr          -0.16
uai           -0.15
uais          -0.14
_reserved     -0.14
ersistence    -0.14
jez           -0.14
Ñĥнк          -0.14
/nginx        -0.13
urat          -0.13
Positive Logits
Token         Logit
watch         0.31
lookout       0.29
.watch        0.25
out           0.25
Watch         0.24
-watch        0.23
Watch         0.23
peeled        0.23
watches       0.23
watch         0.22
Activations Density 0.031%