INDEX
    Explanations

    phrases indicating claims or statements of existence and being

    NEGATIVE LOGITS
    Token         Logit
    "loy"         -0.14
    " Blonde"     -0.14
    " tend"       -0.14
    "kil"         -0.13
    "eral"        -0.13
    "ccount"      -0.13
    "lein"        -0.13
    " Deniz"      -0.13
    "334"         -0.13
    "RT"          -0.13
    POSITIVE LOGITS
    Token         Logit
    " be"         0.22
    " have"       0.17
    " contrary"   0.16
    ".have"       0.15
    "iani"        0.15
    "вар"         0.15
    "oria"        0.15
    "avern"       0.15
    "iembre"      0.15
    "rades"       0.14
    Activation Density 0.059%

    No Known Activations