INDEX
Explanations
specific text indicating advertisements
the presence of advertisements or references to advertisements in the text
Negative Logits
determination     -0.61
ĪĴ                -0.61
alist             -0.57
statistical       -0.55
awakening         -0.53
measurement       -0.52
tal               -0.52
accommodation     -0.52
deduct            -0.51
representative    -0.51
Positive Logits
advertisement     0.77
Continue          0.74
embed             0.67
CONTIN            0.66
RELATED           0.62
Cancel            0.62
é¾įå              0.61
VIDEOS            0.61
Expand            0.61
Advertisement     0.59
Activations Density 0.029%
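
To give a sense of scale for the density figure, here is a minimal sketch that assumes "activations density" means the fraction of token positions where the feature fires above zero. That definition, the function name `activation_density`, and the toy data are assumptions for illustration and are not taken from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: the source page does not state how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 29 of 100,000 token positions.
toy_activations = [1.0] * 29 + [0.0] * (100_000 - 29)
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.029% corresponds to roughly 29 active positions per 100,000 tokens, which would make this a sparse feature.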