INDEX
Explanations
Twitter handles
mentions of Twitter and its associated functionalities
Negative Logits
Token          Logit
moot           -0.60
cised          -0.59
poaching       -0.56
homosexuals    -0.55
cruising       -0.55
Dracula        -0.54
Monk           -0.54
gearing        -0.53
swelling       -0.53
prol           -0.51
Positive Logits
Token          Logit
@              0.97
(@             0.96
Follow         0.75
Follow         0.74
hashtag        0.74
ðŁij           0.71
@              0.68
#$             0.68
"@             0.68
Tweet          0.68
Activations Density: 0.022%