    Negative Logits
     OMIT           -0.06
     więc           -0.06
                    -0.06
    -face           -0.06
     resolved       -0.06
     nosotros       -0.06
    itates          -0.06
    버지            -0.06
    riority         -0.05
    Ý               -0.05
    Positive Logits
    อห              0.08
    .setColumn      0.07
    connexion       0.07
     Gathering      0.07
     Crafting       0.06
    ователь         0.06
    μφ              0.06
     tagging        0.06
     power          0.06
    ทธ              0.06
    Activation Density: 0.017%

    No Known Activations