Explanations
phrases indicating the initiation or progression of actions and feelings
Negative Logits
continue      -0.23
continued     -0.22
continue      -0.21
continuing    -0.21
continuation  -0.19
continues     -0.19
continue      -0.19
continued     -0.18
still         -0.18
continuar     -0.17
Positive Logits
ying          0.21
поба          0.15
notice        0.15
serious       0.15
noticed       0.15
845           0.15
lesh          0.15
taper         0.15
seriously     0.15
NOTICE        0.15
Activations Density: 0.035%