INDEX
    Explanations
    Negative Logits
    -0.07
    024
    -0.06
     Kam
    -0.06
    Canon
    -0.06
     GK
    -0.06
     Func
    -0.06
     {}'.
    -0.06
     dubna
    -0.06
    -0.06
     اقدام
    -0.06
    Positive Logits
    "(class"         0.07
    "meden"          0.07
    " magazine"      0.07
    "_pcm"           0.07
    "lili"           0.06
    ".Restrict"      0.06
    " neighborhood"  0.06
    "unning"         0.06
    "_kernel"        0.06
    "cxx"            0.06
    Act Density 0.001%

    No Known Activations