Explanations
advertisements within the text
occurrences of advertisements within the text
Negative Logits
stra            -0.66
ties            -0.64
wound           -0.63
retri           -0.62
©¶æ             -0.61
graded          -0.61
perspect        -0.60
contingent      -0.57
balls           -0.57
Gaw             -0.56
Positive Logits
Advertisement   1.15
Continue        1.04
ieu             0.82
Skip            0.78
advertising     0.77
usercontent     0.75
@@              0.74
Continue        0.74
Images          0.73
Image           0.70
Activations Density: 0.021%