    Negative Logits
    "عامل"  -0.09
    " والث"  -0.08
    " concernés"  -0.08
    " الث"  -0.08
    ".CO"  -0.08
    " 대상으로"  -0.08
    " التعامل"  -0.07
    "—including"  -0.07
    " بب"  -0.07
    "িতে"  -0.07
    Positive Logits
    " Manchester"  0.08
    0.08
    "Manchester"  0.08
    "Injected"  0.08
    " Figur"  0.08
    "educt"  0.08
    "なく"  0.08
    "orse"  0.07
    " sate"  0.07
    " quizá"  0.07
    Activation Density: 0.002%

    No Known Activations