INDEX
    Explanations

    elements related to arguments and legal proceedings

    Negative Logits
    Token        Logit
    " وين"       -0.53
    "written"    -0.53
    "poken"      -0.50
    "Says"       -0.49
    " ويل"       -0.49
                 -0.49
    " térm"      -0.48
    " Añade"     -0.48
    "Driven"     -0.48
    " Spoken"    -0.48
    Positive Logits
    Token        Logit
    " was"       2.09
    " took"      1.94
    " went"      1.87
    " did"       1.86
    " gave"      1.79
    " came"      1.69
    " became"    1.65
    " didn"      1.63
    " showed"    1.63
    " were"      1.60
    Activation Density: 5.536%

    No Known Activations