INDEX
Explanations
advertisements within text
occurrences of the word "advertisement."
Negative Logits
Token         Logit
iland         -0.75
come          -0.72
ighed         -0.72
ologically    -0.71
ited          -0.69
adows         -0.69
cible         -0.68
ologic        -0.67
cil           -0.66
jer           -0.66
Positive Logits
Token             Logit
Advertisement     1.01
advertisement     0.97
Thumbnails        0.81
Interstitial      0.77
vertising         0.75
Transcript        0.75
culosis           0.72
eering            0.72
ļéĨĴ              0.71
advertisements    0.71
Activations Density: 0.007%