Explanations
social media-related call-to-action terms
instances of the end-of-document token
Negative Logits
Token               Logit
minist              -0.81
pite                -0.75
ILCS                -0.72
endiary             -0.70
illac               -0.66
anyon               -0.62
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.62
inese               -0.61
bably               -0.61
coerced             -0.61
Positive Logits
Token               Logit
@                   0.94
ers                 0.87
Stories             0.86
ed                  0.84
ership              0.84
Sym                 0.84
HuffPost            0.78
Trend               0.75
Follow              0.72
Updates             0.71
Activations Density 0.022%