INDEX
Explanations
promotional messages related to newsletters, subscriptions, and clubs
calls to subscribe or sign up for newsletters and clubs
Negative Logits
unaccount    -0.64
lihood       -0.60
Klu          -0.60
conson       -0.60
beit         -0.59
cific        -0.58
pmwiki       -0.58
positives    -0.58
anz          -0.56
ersen        -0.55
Positive Logits
>>>          0.68
CLASS        0.59
subscribe    0.57
Intel        0.56
Learn        0.55
``           0.55
Subscribe    0.54
smart        0.54
>>>          0.53
ILE          0.53
Activations Density 0.102%
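
The density figure above can be read as the share of sampled tokens on which the feature fires. The page does not state its exact definition, so the sketch below assumes the common one: the fraction of tokens whose activation is strictly above a threshold (zero here). The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires; the dashboard's own definition is not given in the source.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 102 of 100,000 tokens.
sample = [1.5] * 102 + [0.0] * 99_898
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.102%
```

Under this reading, a density of 0.102% means the feature is active on roughly one token in a thousand, which fits a feature tied to a narrow pattern such as subscription prompts.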