INDEX
Explanations
news headlines or article titles
instances of the word "READ" indicating calls to action or prompts for further information
Negative Logits
phies        -0.68
angers       -0.67
phi          -0.66
phy          -0.66
aviour       -0.64
Bret         -0.64
bows         -0.63
asketball    -0.63
amel         -0.62
phia         -0.62
Positive Logits
ALSO         1.01
ING          0.95
MORE         0.94
MORE         0.89
LIST         0.86
INGS         0.85
UPDATE       0.84
DOWN         0.81
BOOK         0.80
ERS          0.80
Activations Density: 0.010%