INDEX
Explanations
mentions of the news channel "Fox News"
references to Fox News
Negative Logits
acterial     -0.77
rians        -0.75
akeru        -0.71
rian         -0.71
inval        -0.70
attled       -0.69
ansom        -0.68
ochemical    -0.68
Definition   -0.67
orically     -0.67
Positive Logits
conn         1.36
hawk         0.92
Fox          0.88
News         0.84
Fox          0.83
fox          0.83
woods        0.82
cat          0.79
News         0.78
FOX          0.77
Activations Density 0.015%
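
To put the density figure in concrete terms, here is a minimal sketch. It assumes that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term, and the helper name `expected_active_tokens` is hypothetical.

```python
# Assumption: density = fraction of sampled tokens where the feature fires.
# The 0.015% value is the one reported on this page.
DENSITY_PERCENT = 0.015


def expected_active_tokens(n_tokens: int, density_pct: float = DENSITY_PERCENT) -> float:
    """Expected number of tokens, out of n_tokens, on which the feature is active."""
    return n_tokens * density_pct / 100.0


if __name__ == "__main__":
    # Print the expected count of active tokens in a sample of 100,000 tokens.
    print(expected_active_tokens(100_000))
```

Under that assumption the feature fires on roughly 15 of every 100,000 tokens, which fits a feature keyed to a narrow topic such as references to Fox News.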