INDEX
Explanations
mentions of the news source "BuzzFeed"
NEGATIVE LOGITS
ipment      -0.77
nonviolent  -0.74
semble      -0.73
reper       -0.71
atics       -0.70
oppable     -0.69
nikov       -0.68
uctor       -0.68
regress     -0.67
semb        -0.67
POSITIVE LOGITS
News        1.16
pedia       0.99
News        0.96
NEWS        0.94
Feed        0.92
Buzz        0.92
BuzzFeed    0.90
Leaks       0.87
Comics      0.87
news        0.87
Activations Density 0.018%