    Negative Logits
    	last           -0.07
    apos            -0.07
    606             -0.06
    registro        -0.06
    department      -0.06
    jem             -0.06
    かる            -0.06
     borderline     -0.06
    (emp            -0.06
    (Y              -0.06
    Positive Logits
                    0.07
     Panthers       0.07
     хорош          0.06
    training        0.06
     ActiveSupport  0.06
     Reds           0.06
    فة              0.06
     Aud            0.06
    听到            0.06
     kul            0.06
    Activation Density: 0.004%

    No Known Activations