    Negative Logits
    Token           Logit
    " належ"        -0.06
    " deduction"    -0.06
    "(."            -0.06
    " affine"       -0.06
    "epsilon"       -0.06
    " Override"     -0.06
    " rd"           -0.06
    " علیه"         -0.06
    "Nh"            -0.06
    "."             -0.06
    Positive Logits
    Token           Logit
    " risk"         0.14
    " Risk"         0.14
    " risks"        0.13
    "risk"          0.12
    "Risk"          0.11
    " dangers"      0.10
    "-risk"         0.09
    "isks"          0.07
    "ritt"          0.07
    " risking"      0.07
    Activation Density: 0.021%

    No Known Activations