Explanations
phrases and words indicating obligation or necessity
Negative Logits
strapon      -0.14
Routine      -0.14
anim         -0.14
engin        -0.14
riend        -0.13
prite        -0.13
/shared      -0.13
dear         -0.13
Exclusive    -0.13
171          -0.13
Positive Logits
åħį          0.19
avoid        0.18
alternative  0.18
Alternative  0.18
Alternative  0.17
avoid        0.17
shadow       0.17
ặp           0.16
shadow       0.16
echo         0.16
Activations Density 0.030%