INDEX
    Negative Logits
     мел        -0.07
    izr         -0.07
     ฟร         -0.07
    شمالی       -0.07
    naz         -0.06
    pol         -0.06
    _circle     -0.06
     retreated  -0.06
    كار         -0.06
    kus         -0.06
    Positive Logits
     Parkinson  0.06
    .failure    0.06
     cowork     0.06
     UNS        0.06
     includ     0.06
     LOOK       0.06
     переж      0.06
     applause   0.06
     anime      0.06
     Chile      0.06
    Activation Density: 0.002%

    No Known Activations