INDEX
Explanations
sentences talking about change or transformation
instances of the word "the."
Negative Logits
Token          Logit
SPONSORED      -0.86
replace        -0.75
tumblr         -0.74
mong           -0.74
alone          -0.68
demonstrates   -0.67
lov            -0.66
usa            -0.66
rand           -0.64
ESPN           -0.62
Positive Logits
Token          Logit
entire         1.27
severity       0.99
whole          0.98
possibility    0.98
entirety       0.98
perception     0.97
effectiveness  0.92
smallest       0.91
proverbial     0.91
slightest      0.90
Activations Density: 0.253%