INDEX
Explanations
references to colors, particularly shades of purple, pink, blue, and black
colors and visual descriptors
Negative Logits
queſto       -0.73
vooz         -0.73
laſſen       -0.73
iſten        -0.69
beſti        -0.68
<unused41>   -0.68
zwiſchen     -0.68
<unused23>   -0.68
<unused20>   -0.68
<unused17>   -0.68
Positive Logits
ioutil           0.27
Außer            0.27
colour           0.27
occupe           0.25
deutschen        0.25
amerikanischen   0.24
ganzen           0.24
weißen           0.24
labelledby       0.23
color            0.23
Activations Density: 0.016%