Explanations
photographs or images
occurrences of the word "Photos."
Negative Logits
continu      -0.72
pipe         -0.70
tract        -0.69
undert       -0.69
incre        -0.66
tobacco      -0.65
grant        -0.65
downt        -0.64
deviation    -0.64
weaker       -0.64
Positive Logits
Photos        4.07
Photos        2.44
photos        2.20
photos        2.09
Photo         1.92
Phot          1.86
Images        1.85
Photography   1.77
Pictures      1.75
Photograph    1.75
Activations Density 0.010%