INDEX
    Explanations

    UI elements

    Negative Logits
        Token               Logit
        " build"            -0.07
        " Yes"              -0.07
        " libert"           -0.06
        " Cant"             -0.06
        "-------------↵"    -0.06
        "wc"                -0.06
        "Ice"               -0.06
        "sheets"            -0.06
        "ولي"               -0.06
        " forbidden"        -0.06
    Positive Logits
        Token               Logit
        "ного"              0.06
        " plaintext"        0.06
        (blank)             0.06
        " января"           0.06
        " Inspir"           0.06
        " cavity"           0.06
        " remed"            0.06
        (blank)             0.06
        "Vision"            0.06
        " derivatives"      0.06
    Activation Density: 0.033%

    No Known Activations