INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    "itlement"       -0.06
    " الإن"          -0.06
    "іє"             -0.06
    " sûr"           -0.06
    "เบ"             -0.06
    " unfair"        -0.06
    "์แ"             -0.06
    "_PRESS"         -0.06
    " propName"      -0.06
    " unreliable"    -0.06
    POSITIVE LOGITS
    Token            Logit
    " narrowed"      0.08
    " qualified"     0.07
    (blank)          0.07
    "qualified"      0.07
    "-result"        0.06
    " narrowing"     0.06
    " tailored"      0.06
    " kavram"        0.06
    "izzard"         0.06
    " beginning"     0.06
    Activation Density: 0.006%

    No Known Activations