INDEX
Explanations
phrases or keywords related to news headlines or article titles
occurrences of the word "the."
Negative Logits
perties    -0.75
lets       -0.73
eca        -0.70
vae        -0.67
alk        -0.65
iously     -0.65
opol       -0.63
uid        -0.63
Kern       -0.62
ae         -0.62
Positive Logits
HEAD       1.47
STORY      1.44
WORK       1.43
THE        1.42
ABOUT      1.41
REPORT     1.41
BOOK       1.41
WORK       1.40
WEEK       1.40
WOR        1.40
Activations Density: 0.192%