Explanations
mentions of advertisements in a text
occurrences of the word "advertisement."
Negative Logits
clud          -0.71
come          -0.70
tein          -0.70
heed          -0.70
honour        -0.67
erva          -0.66
honor         -0.64
iency         -0.64
ologically    -0.64
itialized     -0.64
Positive Logits
Advertisement    1.08
Thumbnails       0.92
advertisement    0.89
Transcript       0.85
culosis          0.83
Interstitial     0.79
Skip             0.76
iban             0.73
advertisement    0.73
Ads              0.73
Activations Density 0.003%