Explanations
content related to social media platforms, particularly Pinterest
references to the social media platform Pinterest
Negative Logits
Token     Logit
ysis      -0.85
redes     -0.79
hood      -0.76
ologne    -0.66
phas      -0.64
Lauder    -0.63
itbart    -0.62
nd        -0.62
ppo       -0.62
odore     -0.62
Positive Logits
Token              Logit
(token missing)    0.91
(token missing)    0.90
sylv               0.82
ipedia             0.75
Filter             0.73
icker              0.72
PHOTO              0.72
(token missing)    0.70
User               0.68
cher               0.67
Activations Density: 0.020%