INDEX
    Explanations
    Negative Logits
    " l"         -0.07
    " kernel"    -0.07
    " Arabic"    -0.06
    " coined"    -0.06
    ".mime"      -0.06
    "らせ"        -0.06
    "leshoot"    -0.06
    " 있다는"     -0.06
    "Spring"     -0.06
    " rooted"    -0.06
    Positive Logits
    " By"            0.30
    "(By"            0.09
    " BY"            0.08
    ".By"            0.07
    "BY"             0.07
    " Ashe"          0.06
    "_BY"            0.06
    ".byId"          0.06
    ".department"    0.06
    " For"           0.06
    Activation Density: 0.011%

    No Known Activations