INDEX
Explanations
mentions of the media outlet "Fox."
Negative Logits
Token     Logit
ering     -0.17
ocal      -0.17
еÑĢж      -0.16
ninger    -0.15
ory       -0.15
ustin     -0.15
ignum     -0.15
elda      -0.15
sect      -0.14
oner      -0.14
Positive Logits
Token     Logit
conn      0.25
xy        0.21
Roths     0.20
CONN      0.20
boro      0.20
worthy    0.20
borough   0.20
croft     0.19
sports    0.18
Trot      0.18
Activations Density: 0.008%