INDEX
Explanations
Twitter handles to follow
instances of the word "Follow" related to social media engagement
Negative Logits
pite        -0.86
ILCS        -0.80
inese       -0.78
negie       -0.75
unicip      -0.72
aez         -0.71
atonin      -0.71
rouse       -0.70
Scotia      -0.68
urrection   -0.65
Positive Logits
Follow      0.97
Follow      0.86
ers         0.84
cies        0.83
follow      0.81
ership      0.78
ed          0.75
Updates     0.70
ï¸ı         0.69
Īè          0.69
Activations Density 0.019%
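
The last two entries in the positive-logits list look garbled, but they may simply be byte-level BPE tokens shown in their raw display form. The page does not say which tokenizer the model uses, so the following is only a sketch under the assumption of a GPT-2-style byte-to-unicode table. The helper names `byte_decoder` and `token_bytes` are hypothetical and exist only for this illustration.

```python
def byte_decoder():
    """Inverse of the GPT-2-style byte-to-unicode table: display char -> byte value.

    Assumption: the model uses GPT-2-style byte-level BPE; the page does not confirm this.
    """
    # Bytes that are displayed as themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {chr(b): b for b in printable}
    # Every other byte is displayed as chr(256 + k), in increasing byte order.
    shift = 0
    for b in range(256):
        if b not in printable:
            table[chr(256 + shift)] = b
            shift += 1
    return table


def token_bytes(token):
    """Map a token's display string back to the raw bytes it stands for."""
    table = byte_decoder()
    return bytes(table[ch] for ch in token)


# The two odd-looking tokens, written with escapes so the source stays ASCII.
for tok in ("\u00ef\u00b8\u0131", "\u012a\u00e8"):
    raw = token_bytes(tok)
    # errors="replace" keeps the loop going when the bytes are not valid UTF-8.
    print(raw.hex(), repr(raw.decode("utf-8", errors="replace")))
```

Under that assumption, the first token maps to the bytes `ef b8 8f`, which is the UTF-8 encoding of U+FE0F (variation selector-16, the code point that often follows an emoji). That would fit the social-media reading of this feature. The second token maps to `88 e8`, which is not valid UTF-8 on its own and is presumably a fragment of a longer multi-byte sequence.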