Explanations
words or phrases related to advertisements
occurrences of advertisements
Negative Logits
Token         Logit
bred          -0.77
joy           -0.73
Kut           -0.71
Myster        -0.66
invite        -0.66
pled          -0.64
motivating    -0.64
Ultr          -0.64
ker           -0.63
Koch          -0.63
Positive Logits
Token            Logit
ADVERTISEMENT    1.34
iciary           1.06
vertising        0.91
interstitial     0.85
ħĭ               0.84
swer             0.82
flush            0.81
citiz            0.81
taboola          0.81
Skip             0.80
Activations Density: 0.004%