INDEX
Explanations
phrases indicating obligation or assertion
Negative Logits
=Value       -0.15
pollo        -0.15
ãĥĥãĥĪ       -0.14
isclosed     -0.14
quisition    -0.14
igg          -0.14
usk          -0.14
799          -0.13
ucket        -0.13
nier         -0.13
Positive Logits
incumbent    0.25
vit          0.23
past         0.20
urgent       0.20
vital        0.19
urgent       0.19
critically   0.18
Past         0.18
essential    0.18
critical     0.17
Activations Density 0.133%