INDEX
Explanations
contractions with "doesn't" or "don't"
instances of the word "doesn't" or its variations
Negative Logits
fox          -0.75
hung         -0.67
armed        -0.65
Remastered   -0.65
dar          -0.63
phant        -0.62
ANG          -0.62
Mant         -0.61
Ability      -0.61
tone         -0.60
Positive Logits
't           1.47
kie          0.87
ettings      0.82
berra        0.77
paces        0.75
olulu        0.73
terness      0.73
ajor         0.72
NOT          0.71
\'           0.70
Activations Density: 0.026%