INDEX
Explanations
references to social media platforms, with a particular focus on Pinterest
references to the social media platform Pinterest
Negative Logits
phas      -0.74
itbart    -0.73
hood      -0.73
redes     -0.66
naire     -0.65
enegger   -0.65
ppo       -0.64
lves      -0.64
holm      -0.63
Dign      -0.63
Positive Logits
[token missing]  1.04
[token missing]  0.91
Filter           0.79
[token missing]  0.79
PHOTO            0.76
sylv             0.76
Photos           0.76
icket            0.75
User             0.70
Password         0.69
Activations Density 0.020%