INDEX
    Explanations
    Negative Logits
    Token         Logit
    " Silk"       -0.07
    " patent"     -0.07
    "Marks"       -0.07
    "ху"          -0.07
    " مك"         -0.06
    "ười"         -0.06
    "regs"        -0.06
    "avic"        -0.06
    " جن"         -0.06
    "وری"         -0.06
    Positive Logits
    Token         Logit
    " push"       0.09
    " лов"        0.07
    "nehmen"      0.06
    "Push"        0.06
    "-push"       0.06
    ".Reporting"  0.06
                  0.06
    " Push"       0.06
    " nắng"       0.06
    ".labels"     0.06
    Act Density 0.006%

    No Known Activations