Explanations
social media platforms
references to the social media platform Pinterest
Negative Logits
Token               Logit
phas                -0.72
nesty               -0.71
\\\\\\\\\\\\\\\\    -0.68
philis              -0.68
selves              -0.67
ysis                -0.66
quartered           -0.65
ppo                 -0.65
scrimmage           -0.65
adesh               -0.65
Positive Logits
Token               Logit
(not shown)         0.94
(not shown)         0.90
PHOTO               0.78
(not shown)         0.75
sylv                0.75
iflower             0.71
photo               0.70
Filter              0.70
lov                 0.69
Database            0.67
Activations Density: 0.046%