INDEX
Explanations
emphatic statements expressing certainty or intensity
Negative Logits
ison      -0.18
pty       -0.14
ocide     -0.14
iele      -0.14
sdale     -0.14
elor      -0.14
unborn    -0.14
retty     -0.14
ial       -0.13
ur        -0.13
Positive Logits
positively    0.22
OLUTE         0.21
olutely       0.21
-таки         0.20
-zero         0.18
zero          0.18
correct       0.17
absolutely    0.16
ley           0.16
querque       0.16
Activations Density 0.017%