Explanations
user identifiers or hashtags
hashtags or labels used in social media posts
Negative Logits
pestic      -0.72
Sov         -0.70
ahime       -0.70
veh         -0.70
weap        -0.69
stitial     -0.69
horizont    -0.68
Wander      -0.67
colle       -0.66
predec      -0.66
Positive Logits
########                            1.30
################################    1.29
################                    1.14
###                                 0.91
##                                  0.82
Posts                               0.82
nice                                0.79
region                              0.78
bitcoin                             0.71
deck                                0.71
Activations Density 0.007%