INDEX
    Explanations
    Negative Logits
    بية            -0.07
     pek           -0.06
    (not shown)    -0.06
    _SIM           -0.06
    люб            -0.06
    ../../         -0.06
     pant          -0.05
    nt             -0.05
     oneself       -0.05
     spinal        -0.05
    Positive Logits
     відріз        0.08
    Listen         0.07
    .background    0.07
    modal          0.07
    ุด             0.07
    PostBack       0.07
    _weight        0.07
    VERTISE        0.07
    (not shown)    0.06
    랍니다         0.06
    Activation Density: 0.004%

    No Known Activations