Explanations
mentions or references to the media outlet "FOX" in text
mentions of the FOX network
Negative Logits
abee      -0.87
ages      -0.81
iano      -0.79
idences   -0.74
arella    -0.73
Edison    -0.73
atton     -0.71
uates     -0.69
quished   -0.69
abus      -0.69
Positive Logits
CAST      0.99
DEN       0.93
ged       0.86
ĪĴ        0.86
çīĪ       0.86
ger       0.83
IUM       0.82
MAT       0.82
ging      0.81
strate    0.81
Activations Density: 0.045%