Explanations
instances of sponsorship or endorsement
instances of the word "sponsored."
Negative Logits
erion       -0.81
esa         -0.79
eeks        -0.78
erm         -0.75
dam         -0.73
endi        -0.71
esville     -0.71
fighters    -0.70
emb         -0.70
fol         -0.67
Positive Logits
ĸļ              0.80
paren           0.70
ividual         0.69
nesday          0.69
snowball        0.66
monton          0.65
Guest           0.65
interstitial    0.65
BY              0.64
authored        0.62
Activations Density 0.061%