INDEX
Explanations
instances of the word "told."
NEGATIVE LOGITS
ILCS            -0.80
aband           -0.76
Interstitial    -0.75
isol            -0.75
illion          -0.71
サーティワン    -0.71
龍喚士          -0.69
ample           -0.68
agement         -0.66
sidx            -0.66
POSITIVE LOGITS
reporters       1.22
BuzzFeed        1.01
HuffPost        1.00
BBC             0.88
VICE            0.88
Politico        0.88
me              0.87
Guardian        0.86
Newsweek        0.84
CNBC            0.84
Activations Density 0.040%