INDEX
Explanations
phrases related to news articles
occurrences of the word "The"
Negative Logits
pecially      -0.64
leve          -0.63
entertained   -0.63
patiently     -0.62
knit          -0.62
thood         -0.62
shown         -0.60
ratified      -0.59
beware        -0.59
(%)           -0.59

Positive Logits
oret           1.19
odor           1.11
resa           1.08
Economist      1.02
atre           0.99
sis            0.93
ories          0.89
Huffington     0.89
Chronicle      0.88
Basics         0.84
Activations Density: 0.086%