INDEX
    Explanations

    expressions related to pride, work, and community objectives

    Negative Logits
    zung    -0.19
    ardy    -0.17
    ilon    -0.17
     tar    -0.16
    imals   -0.15
    ìĦ      -0.14
    pics    -0.14
    LOAT    -0.14
    LD      -0.14
    cht     -0.14
    POSITIVE LOGITS
    ollen         0.17
    омен          0.16
    oller         0.14
    βο            0.14
     Decompiled   0.14
    SAME          0.14
    į             0.14
    itution       0.14
    ãĤ§           0.14
    ancode        0.14
    Act Density 0.341%

    No Known Activations