INDEX
Explanations
news headlines indicating a location
parentheses in the text
NEGATIVE LOGITS
expire            -0.71
overlap           -0.71
racuse            -0.68
surround          -0.67
glare             -0.66
retard            -0.66
majesty           -0.66
entitle           -0.64
multiplication    -0.64
sonic             -0.63
POSITIVE LOGITS
Reuters           1.17
MEN               1.14
via               1.14
formerly          1.09
CBS               1.08
pictured          1.03
AFP               1.03
TAG               1.02
BUS               0.99
...)              0.97
Activations Density 0.050%