INDEX
    Explanations

    punctuation and formatting elements in the text

    NEGATIVE LOGITS
    оÑĢд       -0.19
     Truy      -0.16
    etrofit    -0.15
    edor       -0.15
    ithub      -0.14
    .mj        -0.14
    ioxid      -0.14
    ÑĢиз       -0.14
    imer       -0.13
    viders     -0.13
    POSITIVE LOGITS
    eam        0.16
    lean       0.15
    eya        0.15
    imo        0.14
     �         0.14
     Rifle     0.14
    ä¹İ        0.14
    _AUX       0.14
    TURE       0.13
    rå         0.13
    Act Density 0.006%

    No Known Activations