Explanations
social media platform names
mentions of social media platforms and online sharing options
Negative Logits
manif       -0.64
rapp        -0.60
multipl     -0.53
illon       -0.53
silhou      -0.52
sol         -0.52
asphalt     -0.51
kef         -0.51
Pist        -0.51
tent        -0.50
Positive Logits
ificate     0.70
BSD         0.67
outube      0.66
Tag         0.63
Tumblr      0.63
Afee        0.63
agascar     0.63
            0.63
vertising   0.62
            0.62
Activations Density 0.064%