INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    "ACTER"          -0.07
    " одного"        -0.06
    "ts"             -0.06
    " ген"           -0.06
    "rij"            -0.06
    "accel"          -0.06
    " подоб"         -0.06
    " کد"            -0.06
    " आद"            -0.06
    " informat"      -0.06
    POSITIVE LOGITS
    Token            Logit
    " work"          0.13
    " Work"          0.11
    "Work"           0.10
    "work"           0.08
    " práci"         0.08
    "-work"          0.08
    " WORK"          0.07
    "WORK"           0.07
    " groundwork"    0.07
    "biology"        0.07
    Activation Density: 0.031%

    No Known Activations