    Explanations

    punctuation and structural elements in code

    Negative Logits
    Token            Logit
    " MIS"           -0.17
    "ilde"           -0.16
    ".wp"            -0.15
    " Rebellion"     -0.15
    "ancies"         -0.14
    " ØŃاÙ쨏"        -0.13
    "ÑĢап"           -0.13
    "zych"           -0.13
    "oland"          -0.13
    "icken"          -0.13
    Positive Logits
    Token            Logit
    "unless"         0.26
    " unless"        0.25
    " my"            0.22
    " confess"       0.22
    " die"           0.22
    "my"             0.21
    "cro"            0.20
    "(my"            0.20
    " scalar"        0.20
    " print"         0.20
    Activation Density: 0.002%

    No Known Activations