INDEX
Explanations
news-related keywords
the term "READ" followed by a call to action or further engagement
Negative Logits
eleg        -0.69
strat       -0.67
humor       -0.64
harm        -0.63
neutral     -0.63
Chase       -0.63
intrusion   -0.62
premium     -0.60
induction   -0.60
Cantor      -0.59
Positive Logits
READ        4.33
WATCH       1.86
READ        1.84
read        1.75
SPONSORED   1.68
RELATED     1.65
Read        1.63
reads       1.57
reading     1.49
VIDEO       1.49
Activations Density: 0.010%