INDEX
Explanations
references to visual imagery or photographs
Negative Logits
åij³        -0.17
ring       -0.15
ities      -0.15
ilet       -0.15
osh        -0.15
ongo       -0.15
lei        -0.15
kuru       -0.15
Ùij         -0.14
shot       -0.14
Positive Logits
orial      0.23
hÆ°á»Łng   0.20
-per       0.19
ocks       0.19
perfect    0.18
ofday      0.17
perfect    0.17
SizeMode   0.16
ockey      0.16
colo       0.16
Activations Density: 0.033%