INDEX
    Explanations

    positive adjectives

    NEGATIVE LOGITS
     bowel       -0.07
     AMD         -0.07
     qr          -0.07
     Cd          -0.06
    .newLine     -0.06
     умовах      -0.06
                 -0.06
    (xi          -0.06
     mnoha       -0.06
    ц            -0.06
    POSITIVE LOGITS
    rief         0.06
    ="<          0.06
    .isHidden    0.06
    ustrial      0.06
    HW           0.06
     webs        0.06
     Gregg       0.06
    CT           0.06
    _critical    0.06
    ME           0.06
    Activation Density: 0.425%

    No Known Activations