INDEX
Explanations
references to the Fox network and its associated news shows
Negative Logits
еÑĢж     -0.18
urve     -0.16
ading    -0.16
\CMS     -0.15
usa      -0.15
erno     -0.15
lük      -0.15
antu     -0.14
ering    -0.14
akin     -0.14
Positive Logits
conn     0.28
Roths    0.25
worthy   0.24
411      0.23
CONN     0.23
croft    0.23
xy       0.22
hole     0.21
borough  0.20
woods    0.20
Activations Density: 0.009%