Explanations
advertisement-related terms
instances of advertisements
Negative Logits
assic          -0.75
gorilla        -0.65
mates          -0.64
ecstasy        -0.62
mate           -0.61
Lans           -0.61
marsh          -0.61
gran           -0.60
subsistence    -0.59
Dull           -0.58
Positive Logits
Skip            0.89
Comments        0.79
ROR             0.77
Images          0.76
VERTISEMENT     0.76
Thumbnails      0.75
Stories         0.72
Subscribe       0.72
Continue        0.71
Links           0.70
Activations Density 0.023%
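
A density figure like the one above is usually read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes "fires" means an activation strictly above a threshold of 0. The function name `activation_density` and the toy data are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 10,000 token positions, 2 of which activate the feature.
toy = [0.0] * 9998 + [1.7, 0.4]
print(f"{activation_density(toy):.3%}")  # prints 0.020%
```

Under this reading, a density of 0.023% would correspond to roughly 23 active positions per 100,000 tokens.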