INDEX
    Explanations

    temporal phrases related to the effects and outcomes of treatments

    Negative Logits
     itſelf    -0.80
    ագրություններ    -0.71
     fashiola    -0.70
     queſto    -0.68
    [token not rendered]    -0.67
    𑄧    -0.63
    𑄟    -0.63
    [token not rendered]    -0.63
     propOrder    -0.63
    AndEndTag    -0.63

    Positive Logits
    ďaka    0.30
     this    0.28
     jspb    0.26
     dzięki    0.26
     nhờ    0.25
    [token not rendered]    0.25
     efforts    0.24
     esforço    0.24
    новништво    0.23
     Rich    0.23
    Activation Density: 0.025%

    No Known Activations