INDEX
    Explanations

    the word "not" indicating negation or denial

    Negative Logits
    /animate      -0.15
    auc           -0.14
    onavir        -0.13
     Ov           -0.13
    ŀ             -0.12
    uming         -0.12
    .fix          -0.12
    outil         -0.12
    voor          -0.12
    .writerow     -0.12

    Positive Logits
    azi           0.15
    Been          0.15
    .Extension    0.15
    arer          0.14
    arov          0.14
    alama         0.14
    olume         0.14
    еÑĢп          0.14
    adow          0.14
    ORTH          0.13

    Act Density 0.021%

    No Known Activations