    Explanations

    negations and conditional phrases indicating limitations or restrictions in a context

    Negative Logits
    Token            Logit
    "tro"            -0.22
    " tro"           -0.21
    "é«ĺéĢŁ"         -0.17
    "tl"             -0.15
    " pole"          -0.15
    "edin"           -0.15
    "rich"           -0.15
    "Tro"            -0.15
    " def"           -0.14
    "رخ"             -0.14

    Positive Logits
    Token            Logit
    " acceptance"    0.18
    " Settlement"    0.18
    " Start"         0.16
    " Accept"        0.16
    "opup"           0.16
    " accepted"      0.16
    "accepted"       0.16
    "rell"           0.15
    "accept"         0.15
    "oldt"           0.15

    Activation Density: 0.030%

    No Known Activations