    Explanations

    negations and conditional phrases

    Negative Logits
    Token       Logit
    " we"       -0.63
    " We"       -0.57
    " I"        -0.56
    " and"      -0.50
    " by"       -0.49
    " who"      -0.49
    "<bos>"     -0.49
    " for"      -0.49
    " he"       -0.49
    " rahat"    -0.48
    Positive Logits
    Token                  Logit
    " itſelf"              1.07
    "Билгалдахарш"         0.94
    " gyhoeddwyd"          0.87
    " unknownFields"       0.86
    "jsonwebtoken"         0.83
    " ErrIntOverflow"      0.83
    "MigrationBuilder"     0.81
    " mergeFrom"           0.81
    " للمعارف"             0.80
    "CloseOperation"       0.80
    Activation Density: 0.866%

    No Known Activations