INDEX
Explanations
mentions of the word "Ad" followed by a number
references to advertisements or promotional content
Negative Logits
Token                 Logit
OUP                   -0.79
terday                -0.74
20439                 -0.72
ANGE                  -0.71
externalActionCode    -0.69
ĸļ                    -0.68
Burk                  -0.67
KEY                   -0.65
Ń·                    -0.64
Kubrick               -0.62
Positive Logits
Token                 Logit
vertis                1.33
vertisements          1.30
vertising             1.10
olesc                 1.10
ept                   1.06
mittedly              1.06
mit                   0.98
olescent              0.98
hesive                0.96
elaide                0.96
Activations Density 0.006%
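
As a rough illustration of how a figure like the activation density above might be read, the sketch below assumes that density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition for illustration only: a position counts as "active"
    when its activation is strictly greater than `threshold`.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: a feature that fires on 6 of 100,000 token positions.
acts = [0.0] * 100_000
for i in (5, 120, 4_000, 31_337, 60_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.006% would correspond to the feature firing on roughly 6 out of every 100,000 tokens, which fits a feature tied to a narrow pattern such as "Ad" followed by a number.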