Explanations
mentions of specific publications or news outlets
Negative Logits
wana        -0.76
ABE         -0.75
urtles      -0.69
ament       -0.68
etime       -0.66
achev       -0.66
zsche       -0.65
ayer        -0.65
ements      -0.65
gradient    -0.64
Positive Logits
Chronicle   1.05
Herald      0.92
naire       0.81
Magazine    0.71
Books       0.70
itect       0.69
Newsp       0.69
mare        0.68
Phoenix     0.66
Pages       0.65
Activations Density: 0.019%