INDEX
Explanations
instances of the contraction "you're."
Negative Logits
awa          -0.18
rep          -0.16
zimmer       -0.15
styleType    -0.15
reach        -0.15
qus          -0.15
react        -0.15
prit         -0.15
reed         -0.15
CADE         -0.15
Positive Logits
gonna        0.19
becca        0.16
able         0.16
s            0.16
eks          0.16
ihan         0.16
illy         0.16
igious       0.15
ified        0.15
nbsp         0.15
Activations Density: 0.003%