Explanations
mentions of the television network "Fox News"
Negative Logits
rians        -0.71
orically     -0.67
inval        -0.66
rogens       -0.65
quo          -0.65
rian         -0.63
rusty        -0.63
Archdemon    -0.62
reddits      -0.62
acterial     -0.62
Positive Logits
conn         1.59
News         0.96
News         0.96
woods        0.92
croft        0.87
cat          0.87
fire         0.86
xes          0.84
xy           0.84
borough      0.84
Activations Density: 0.028%