Explanations
mentions of news agencies or media outlets
Negative Logits
starter     -0.70
adobe       -0.70
atorium     -0.69
̶           -0.65
starting    -0.65
powers      -0.64
termin      -0.63
ighton      -0.63
Bowser      -0.63
cus         -0.62
Positive Logits
VIDEOS      1.00
EG          0.85
HY          0.84
NEWS        0.84
ews         0.83
PRESS       0.80
INESS       0.78
IMAGES      0.78
olitics     0.76
News        0.76
Activations Density 0.020%