    Explanations

    phrases indicating uncertainty or reservation in a statement

    Negative Logits
    Token      Logit
    fta        -1.30
    secon      -1.29
    oner       -1.29
    aen        -1.28
    emphat     -1.26
    „,         -1.26
    seiz       -1.26
    ?...       -1.21
    hcm        -1.21
    perfon     -1.21
    Positive Logits
    Token      Logit
    digress    0.76
    also       0.67
    still      0.64
    também     0.62
    aren       0.59
    isn        0.59
    wasn       0.59
    don        0.58
    worse      0.57
    didn       0.57
    Activation Density: 0.397%

    No Known Activations