INDEX
    Explanations

    words related to physical movement or actions

    Negative Logits
    kek             -0.19
    InBackground    -0.17
    esson           -0.16
    kenin           -0.15
    emax            -0.15
    ÅĤe             -0.14
    ooter           -0.14
    åīĤ             -0.14
    лади            -0.14
    bine            -0.14
    Positive Logits
    ysis            0.19
    yg              0.17
    asel            0.17
    é϶              0.17
    olia            0.16
    yer             0.15
    DP              0.14
    ierz            0.14
    ench            0.14
    utor            0.14
    Activation Density: 0.023%

    No Known Activations