INDEX
Explanations
phrases that emphasize certainty or affirmation
Negative Logits
Token       Logit
_WP         -0.13
-await      -0.13
isors       -0.13
dint        -0.12
anything    -0.12
965         -0.12
ieres       -0.12
ovu         -0.12
Anything    -0.12
agua        -0.12
Positive Logits
Token       Logit
many        0.27
certain     0.26
times       0.25
few         0.24
several     0.23
nt          0.22
no          0.21
two         0.21
many        0.20
still       0.20
Activations Density: 0.071%