INDEX
    Explanations

    phrases indicating absence or negation

    Negative Logits
    asl
    -0.15
     dh
    -0.14
    kses
    -0.14
     Brock
    -0.14
    enos
    -0.14
    gether
    -0.14
     al
    -0.13
     Marin
    -0.13
    otti
    -0.13
     IIC
    -0.13
    Positive Logits
    任何
    0.24
     altogether
    0.23
     вообще
    0.22
     any
    0.20
     žádné
    0.19
    CKER
    0.17
     vůbec
    0.17
     هیچ
    0.16
     Alto
    0.16
     ANY
    0.16
    Activation Density 0.263%

    No Known Activations