    Explanations

    the color purple in various contexts

    Negative Logits
    " silver"      -0.14
    "ivet"         -0.14
    "MF"           -0.14
    " Demir"       -0.14
    "erra"         -0.14
    " brunette"    -0.14
    " orange"      -0.14
    "lad"          -0.14
    "aly"          -0.14
    " black"       -0.13
    Positive Logits
    "色的"         0.18
    "-red"         0.17
    "/red"         0.16
    "prints"       0.16
    "색"           0.16
    "/blue"        0.16
    "ào"           0.15
    "oft"          0.15
    "-coded"       0.15
    "Łèĥ½"         0.15
    Act Density 0.011%

    No Known Activations