INDEX
Explanations
promotions or deals offering rewards or benefits
phrases related to receiving benefits or rewards
Negative Logits
igr        -0.82
lie        -0.69
worthy     -0.64
eem        -0.63
Recomm     -0.61
abin       -0.61
bourne     -0.60
eli        -0.60
factor     -0.59
fallacy    -0.58
Positive Logits
rid             1.11
reimb           1.03
refunds         0.94
preferential    0.94
rewarded        0.92
bumped          0.88
access          0.87
compensated     0.86
paid            0.85
unlimited       0.81
Activations Density: 0.097%