INDEX
Explanations
words related to unauthorized or unacceptable actions or situations
terms related to validity, approval, and status of actions or entities
Negative Logits
ioxide    -0.82
Clar      -0.72
zel       -0.71
otto      -0.70
ois       -0.69
alon      -0.69
frey      -0.66
udeb      -0.65
APH       -0.65
crit      -0.65
Positive Logits
distractions    0.77
adoes           0.74
excuses         0.73
nesses          0.73
anymore         0.70
wastes          0.70
pregnancies     0.69
Abandon         0.65
ly              0.64
Territory       0.63
Activations Density: 0.123%