INDEX
Explanations
references to BuzzFeed and Gawker
mentions of the media company BuzzFeed
Negative Logits
semble                   -0.84
atics                    -0.75
nonviolent               -0.69
oppable                  -0.68
stood                    -0.66
BuyableInstoreAndOnline  -0.65
jury                     -0.65
abeth                    -0.64
cludes                   -0.64
arij                     -0.63
Positive Logits
Leaks     1.01
BuzzFeed  0.95
News      0.95
pedia     0.93
NEWS      0.91
Feed      0.87
Buzz      0.86
lus       0.84
lish      0.78
][        0.77
Activations Density: 0.008%