INDEX
Explanations
phrases starting with "Don't"
phrases that use the contraction "don't."
Negative Logits
Reviewer        -0.79
amer            -0.77
Site            -0.65
bard            -0.65
itiz            -0.65
interstitial    -0.64
éĥ              -0.64
Britann         -0.64
è£ıè            -0.64
cano            -0.63
Positive Logits
necessarily     1.23
bother          1.05
exactly         1.01
icably          0.96
icable          0.92
gonna           0.90
really          0.88
quite           0.88
urtles          0.84
even            0.82
Activations Density: 0.103%