INDEX
Explanations
advertisements within the text
the presence of advertisements
Negative Logits
ties         -0.70
graded       -0.67
retri        -0.63
stra         -0.63
makeshift    -0.61
mate         -0.60
perspect     -0.60
helicop      -0.59
actu         -0.58
stood        -0.57
Positive Logits
Advertisement    1.09
Continue         1.04
@@               0.74
ieu              0.72
usercontent      0.71
Advertisement    0.71
Images           0.71
Exit             0.70
advertisement    0.69
Image            0.68
Activations Density 0.021%