INDEX
    Explanations

    programming code

    Negative Logits
    Token                  Logit
    "üssen"                -0.08
    " Eduardo"             -0.07
    " surgery"             -0.07
    " конкур"              -0.07
    (no visible token)     -0.07
    "Win"                  -0.07
    "承担"                 -0.06
    "رؤ"                   -0.06
    (no visible token)     -0.06
    "<<<<"                 -0.06
    Positive Logits
    Token                  Logit
    "ובר"                  0.07
    "aty"                  0.07
    "اري"                  0.07
    "$t"                   0.07
    " cloth"               0.07
    " welfare"             0.06
    "ucken"                0.06
    " rear"                0.06
    " матч"                0.06
    "Watch"                0.06
    Activation Density: 0.069%

    No Known Activations