INDEX
Explanations
dates mentioned in news articles
parentheses in the text
Negative Logits
langu         -0.78
discrim       -0.72
asteroids     -0.70
altogether    -0.69
increment     -0.65
unsett        -0.65
attacker      -0.65
skyrocket     -0.65
onward        -0.65
adversary     -0.65
Positive Logits
Photo         1.52
Courtesy      1.37
photo         1.25
Credit        1.25
credit        1.25
Picture       1.07
Reuters       1.06
Screenshot    1.06
PHOTO         1.04
Courtesy      1.04
Activations Density: 0.057%