INDEX
Explanations
phrases indicating the initiation or progression of actions or feelings
Negative Logits
continue        -0.22
continue        -0.21
continued       -0.20
continuing      -0.20
continuation    -0.18
continues       -0.18
continu         -0.17
continued       -0.17
continue        -0.17
continuar       -0.17
Positive Logits
ying            0.18
поба            0.17
NOTICE          0.17
notice          0.17
Feel            0.16
TLS             0.15
taper           0.15
lesh            0.15
noticing        0.15
noticed         0.15
Activations Density 0.058%