    Explanations

    negations and denials in statements

    Negative Logits
    Token       Logit
    etto        -0.16
    idders      -0.15
    ограф       -0.14
     wakeup     -0.14
    isine       -0.14
    etti        -0.14
    .motion     -0.14
    irth        -0.14
    ja          -0.13
    ány         -0.13
    Positive Logits
    Token       Logit
    ndl         0.18
    umas        0.17
    γά          0.16
    fts         0.15
     Rout       0.15
     yet        0.15
    ropol       0.15
    nad         0.14
    remium      0.14
     BK         0.14
    Act Density 0.228%

    No Known Activations