    Negative Logits
    Token             Logit
    "03"              -0.08
    "Prod"            -0.07
    "07"              -0.07
    "02"              -0.06
    " Ulus"           -0.06
    ".artist"         -0.06
    " pad"            -0.06
    " NAFTA"          -0.06
    "-mode"           -0.06
    "ircle"           -0.06
    Positive Logits
    Token             Logit
    " cervical"       0.10
    "ving"            0.08
    " cerv"           0.08
    "_RCC"            0.06
    " heterosexual"   0.06
    ".kafka"          0.06
    "avy"             0.06
    " TValue"         0.06
    " ،"              0.06
    "เวอร"            0.06
    Activation Density: 0.002%

    No Known Activations