INDEX
    Explanations
    Negative Logits
     filed      -0.08
    atırım      -0.07
    (Project    -0.06
     dish       -0.06
     shoots     -0.06
     teaches    -0.06
    udies       -0.06
     inspires   -0.06
    Opera       -0.06
     basement   -0.06
    Positive Logits
     will       0.08
                0.07
    029         0.06
    unchecked   0.06
    884         0.06
    kemiz       0.06
    728         0.06
    859         0.06
     excer      0.06
    mys         0.06
    Act Density 0.166%

    No Known Activations