    Explanations

    code/configuration files

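    Negative Logits
        (no token shown)    -0.07
        "Hash"              -0.06
        " *="               -0.06
        " Ž"                -0.06
        "Am"                -0.06
        "اغ"                -0.05
        (no token shown)    -0.05
        " Sak"              -0.05
        (no token shown)    -0.05
        (no token shown)    -0.05

    Positive Logits
        "aturally"          0.07
        "iano"              0.07
        " ban"              0.07
        (no token shown)    0.07
        "↵↵    ↵"           0.07
        " Pla"              0.07
        "pected"            0.06
        "/context"          0.06
        " disrupting"       0.06
        " RTS"              0.06
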
    Activation Density: 0.010%

    No Known Activations