INDEX
Explanations
mentions of specific news outlets
Negative Logits
ation    -0.17
elda     -0.16
ignum    -0.15
hin      -0.15
ourcem   -0.15
ñana     -0.15
exus     -0.15
atus     -0.15
andex    -0.15
fant     -0.14
Positive Logits
hawk     0.16
croft    0.16
ridge    0.16
ahoo     0.15
mill     0.15
acco     0.15
scape    0.15
yst      0.15
piler    0.14
Avery    0.14
Activations Density: 0.018%