INDEX
Explanations
phrases or words related to restrictions or prohibitions
negative phrases or terms, particularly "no" used as a prefix to describe restrictions or limitations
Negative Logits
ãĥ¼ãĥĨ       -0.90
APTER        -0.81
å§«          -0.79
ãĥ¼ãĥĨãĤ£    -0.78
çīĪ          -0.77
Reloaded     -0.76
Ń·           -0.75
gypt         -0.68
Bohem        -0.68
Royale       -0.67
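Several of the negative-logit tokens (for example `ãĥ¼ãĥĨ` and `å§«`) look like byte-level BPE vocabulary strings, not readable text. The sketch below shows one way to turn them back into text. It assumes the model uses a GPT-2-style byte-to-unicode mapping, which the dashboard does not state; `bytes_to_unicode` reproduces that mapping, and `decode_token` is a hypothetical helper name introduced here for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character.

    Assumption: the feature's model uses this byte-level BPE convention.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a byte-level token string back to UTF-8 text (hypothetical helper).

    Incomplete byte sequences are rendered as U+FFFD instead of raising.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĥĨ", "å§«", "ãĥ¼ãĥĨãĤ£", "çīĪ", "Ń·", "Reloaded"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ãĥ¼ãĥĨ` decodes to `ーテ`, `å§«` to `姫`, `ãĥ¼ãĥĨãĤ£` to `ーティ`, and `çīĪ` to `版`. `Ń·` maps to the bytes `0xAD 0xB7`, which do not form a complete UTF-8 character on their own, so it is shown as replacement characters. Plain ASCII tokens such as `Reloaded` pass through unchanged.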
Positive Logits
brainer      1.04
reply        0.98
know         0.92
interest     0.92
matter       0.92
issue        0.91
notice       0.91
purpose      0.90
practice     0.90
move         0.89
Activations Density 0.018%