Explanations
references to images or photographs
references to photos and videos
Negative Logits
Token        Logit
stood        -0.90
vernment     -0.69
phi          -0.66
pperc        -0.65
rule         -0.65
nder         -0.64
assurance    -0.62
denomin      -0.62
speech       -0.61
period       -0.61
Positive Logits
Token        Logit
PHOTOS       1.22
IMAGES       1.11
PHOTO        1.09
Photos       1.06
Images       1.03
VIDEOS       1.02
Images       0.98
Photograph   0.94
photos       0.93
MORE         0.89
Activations Density: 0.017%