INDEX
    Explanations

    negations and terms associated with conditions or states of being

    Negative Logits
    "slack"       -0.14
    " Nack"       -0.14
    "ATTER"       -0.14
    " useForm"    -0.14
    "aron"        -0.14
    "uos"         -0.14
    "inar"        -0.13
    "erna"        -0.13
    "atter"       -0.13
    "anda"        -0.13
    Positive Logits
    " pas"        0.42
    " PAS"        0.32
    " Pas"        0.31
    "Pas"         0.31
    "pas"         0.30
    "_pas"        0.27
    " pasa"       0.22
    " pás"        0.19
    " jamais"     0.19
    " gu"         0.19
    Activation Density: 0.009%

    No Known Activations