INDEX
Explanations
specific phrases encouraging newsletter subscriptions
phrases indicating the source of information or stories
Negative Logits
imates      -0.77
imental     -0.75
heim        -0.74
imate       -0.73
omsky       -0.69
gow         -0.69
Í           -0.67
retard      -0.65
RGB         -0.62
hetics      -0.62
Positive Logits
clicking     1.35
signing      1.18
subscribing  1.16
bookmark     1.08
downloading  1.06
submitting   1.03
joining      1.01
selecting    1.01
liking       0.98
visiting     0.96
Activations Density: 0.026%