INDEX
Explanations
references to the color blue
Negative Logits
Token    Logit
nee      -0.18
rei      -0.17
lov      -0.17
teen     -0.16
la       -0.15
lp       -0.15
di       -0.15
los      -0.15
serie    -0.15
num      -0.15
Positive Logits
Token    Logit
prints   0.38
grass    0.34
berry    0.33
berries  0.30
-collar  0.29
bird     0.28
jay      0.27
bell     0.26
chip     0.24
skies    0.24
Activations Density: 0.021%