INDEX
Explanations
news articles or headlines from the Reuters agency
references to the news agency Reuters
Negative Logits
haun        -0.75
adm         -0.67
flo         -0.63
fasc        -0.61
naires      -0.60
chrom       -0.59
iencies     -0.58
comprom     -0.57
quer        -0.57
tein        -0.57
POSITIVE LOGITS
)—          0.74
)-          0.72
)           0.71
Images      0.69
Reuters     0.67
)|          0.65
ARTICLE     0.63
Coverage    0.62
Spoiler     0.62
PLA         0.62
Activations Density 0.015%