INDEX
    Negative Logits
    erial  -0.07
     طبی  -0.06
     uống  -0.06
    ipe  -0.06
    ikh  -0.06
     Phon  -0.06
     نسب  -0.06
    _related  -0.06
     pirates  -0.06
    jack  -0.06
    Positive Logits
    �合  0.07
    ='./  0.07
     slut  0.06
    (fin  0.06
    0.06
     proclamation  0.06
     العمل  0.06
     신규  0.06
    /person  0.06
    정이  0.06
    Activation Density: 0.215%

    No Known Activations