INDEX
Explanations
variations of the phrase "I don't."
instances of the phrase "I don't"
Negative Logits
Dise       -0.75
VERS       -0.70
afore      -0.70
sighted    -0.68
DOC        -0.67
ategory    -0.66
milo       -0.66
Vision     -0.64
Ability    -0.64
cised      -0.63
Positive Logits
't         1.57
uts        0.96
kie        0.88
nell       0.84
nie        0.82
ned        0.82
ÃŃ         0.80
ning       0.77
ates       0.76
ct         0.75
Activations Density 0.057%