INDEX
Explanations
phrases related to advertisements prompting the reader to continue reading a story
repeated phrases related to advertisements or prompts to proceed with reading
Negative Logits
*/(         -0.70
ouls        -0.69
mith        -0.67
ionic       -0.64
axter       -0.63
lain        -0.63
knife       -0.62
ially       -0.62
otom        -0.62
gyn         -0.61
Positive Logits
Reading     0.83
reading     0.82
scrolling   0.79
clicking    0.70
Loading     0.69
Continue    0.69
arming      0.62
taboola     0.61
Below       0.61
contacting  0.60
Activations Density 0.016%