INDEX
    Explanations

    references to literature and authorship

    Negative Logits
    Token         Logit
    "ogne"        -0.15
    "cio"         -0.15
    "á»iji"       -0.15
    " bordel"     -0.15
    "urum"        -0.15
    "PCODE"       -0.14
    "rze"         -0.14
    " chore"      -0.14
    "Activate"    -0.14
    "бач"         -0.14
    Positive Logits
    Token               Logit
    " Sketch"           0.19
    "Sketch"            0.17
    " Outline"          0.17
    "ergarten"          0.16
    "atile"             0.16
    "оба"               0.15
    " Traits"           0.15
    "ooks"              0.15
    " Serialization"    0.15
    "ир"                0.15
    Activation Density 0.080%

    No Known Activations