Explanations
words related to news articles or journalistic content
references to articles or segments within articles
Negative Logits
Token      Logit
asters     -0.74
alth       -0.73
kson       -0.72
cffffcc    -0.71
enthal     -0.68
aterasu    -0.66
uits       -0.65
merce      -0.64
bledon     -0.64
aukee      -0.64
Positive Logits
Token      Logit
Continued  1.20
CONTIN     0.91
continues  0.89
Contin     0.86
ICLE       0.77
excerpt    0.70
omitted    0.70
VIII       0.68
VII        0.67
contents   0.67
Activations Density 0.014%