INDEX
    Explanations

    references to negation or the word "nor"

    Negative Logits
    "PTH"        -0.16
    "ITTE"       -0.16
    "bak"        -0.15
    " Bak"       -0.15
    "erable"     -0.15
    " bak"       -0.15
    "į"          -0.14
    "ouse"       -0.14
    "áh"         -0.14
    "ancell"     -0.14

    Positive Logits
    "icont"      0.15
    " nail"      0.15
    " writing"   0.15
    " Cad"       0.14
    " Dial"      0.14
    " Bre"       0.14
    " Bol"       0.14
    "ancode"     0.14
    " Lena"      0.14
    " cad"       0.13
    Activation Density 0.006%

    No Known Activations