INDEX
    Explanations

    references to collective actions or community statements

    Negative Logits
    "argo"          -0.17
    "reachable"     -0.15
    "ylie"          -0.15
    ".icons"        -0.15
    " Favor"        -0.15
    "elper"         -0.14
    "suming"        -0.14
    "plat"          -0.14
    " butt"         -0.14
    "AFX"           -0.14
    Positive Logits
    " couldn"       0.25
    "couldn"        0.23
    " Couldn"       0.23
    " are"          0.20
    " feel"         0.19
    " proud"        0.19
    " congr"        0.19
    " happy"        0.18
    " są"           0.18
    " be"           0.18
    Activation Density: 0.103%

    No Known Activations