    NEGATIVE LOGITS
    Token        Logit
    "idis"       -0.20
    "ixel"       -0.18
    ">Main"      -0.17
    "illet"      -0.16
    "elt"        -0.16
    ">Show"      -0.15
    ">Delete"    -0.15
    "gaard"      -0.15
    "кин"        -0.14
    " Gifts"     -0.14
    POSITIVE LOGITS
    Token        Logit
    " align"     0.20
    "ngo"        0.19
    " style"     0.19
    "><"         0.19
    " ALIGN"     0.18
    "Align"      0.17
    "align"      0.17
    " Align"     0.17
    " class"     0.17
    "ustos"      0.17
    Activation Density: 0.024%

    No Known Activations