Explanations
the color purple in various contexts
Negative Logits
Token      Logit
silver     -0.14
ivet       -0.14
MF         -0.14
Demir      -0.14
erra       -0.14
brunette   -0.14
orange     -0.14
lad        -0.14
aly        -0.14
black      -0.13
Positive Logits
Token      Logit
èī²çļĦ     0.18
-red       0.17
/red       0.16
prints     0.16
ìĥī        0.16
/blue      0.16
Ãło        0.15
oft        0.15
-coded     0.15
Łèĥ½       0.15
Activations Density: 0.011%