Explanations
negation terms indicating dissent or contradiction
Negative Logits
Token             Logit
Datuak            -0.87
SOUNDBITE         -0.82
StoryboardSegue   -0.81
ConstraintMaker   -0.81
photolibrary      -0.73
nakalista         -0.73
noqa              -0.71
ſever             -0.71
Autoritní         -0.70
heiress           -0.70
Positive Logits
Token       Logit
not         0.98
Not         0.72
NOT         0.69
not         0.68
likely      0.66
Not         0.65
NOT         0.56
even        0.54
t           0.54
typically   0.54
Activations Density: 0.310%