Explanations
words related to social media behavior, such as "posted" and "selfie"
instances of posts or sharing content on social media platforms
Negative Logits
Token        Logit
itect        -0.70
clave        -0.66
osi          -0.65
ichick       -0.63
ahime        -0.61
cale         -0.60
Lauder       -0.59
umbledore    -0.59
YD           -0.59
essen        -0.59
Positive Logits
Token        Logit
pics         1.12
screenshots  1.10
pictures     1.02
photos       1.01
Tweet        0.98
onto         0.94
selfies      0.94
flyers       0.90
tweets       0.89
selfie       0.89
Activations Density 0.234%
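
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample activations are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 1000 token positions.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.234% would mean the feature is active on roughly 2 to 3 of every 1000 tokens, which fits a feature that fires only on specific vocabulary.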