INDEX
Explanations
contractions involving "don't" and the word "don't" by itself
instances of the phrase "don't" or its variations
Negative Logits
Token          Logit
Reality        -0.66
Butt           -0.64
Remastered     -0.63
cised          -0.62
Completed      -0.61
DRAGON         -0.61
Advice         -0.61
afore          -0.60
Dise           -0.60
Personality    -0.60
Positive Logits
Token          Logit
't             1.35
ned            0.99
ning           0.85
ates           0.82
ners           0.80
ated           0.78
etsk           0.78
nels           0.74
ate            0.74
uts            0.74
Activations Density: 0.106%