INDEX
    Explanations

    words related to interactions and engagement with an audience

    phrases indicating audience acceptance or engagement

    Negative Logits
    "WithNo"      -0.80
    "ĨĴ"          -0.78
    "ELD"         -0.70
    "ãĥĺãĥ©"      -0.69
    "ãĤ¬"         -0.67
    "ãĥĩãĤ£"      -0.66
    "ãĤ´"         -0.65
    "itary"       -0.65
    "ãĤ¦ãĤ¹"      -0.63
    "ARB"         -0.62
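Several of the negative-logit tokens (for example ãĥĺãĥ©) look like the printable stand-ins that GPT-2-style byte-level BPE vocabularies use for raw bytes. The page does not say which tokenizer produced them, so the following is only a minimal sketch under that assumption; `byte_alphabet`, `token_bytes`, and `decode_token` are hypothetical helper names, not part of any dashboard API.

```python
def byte_alphabet():
    """Rebuild the GPT-2-style byte -> printable-character table.

    Assumption: the tokens on this page use that convention.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned the next code point from U+0100 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, (chr(c) for c in code_points)))


# Inverse table: displayed character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_alphabet().items()}


def token_bytes(token: str) -> bytes:
    """Map a displayed token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token: str) -> str:
    """Decode a displayed token as UTF-8; undecodable fragments become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


# decode_token("ãĥĺãĥ©")  -> "ヘラ"
# decode_token("ãĤ¬")     -> "ガ"
# decode_token("ELD")     -> "ELD"
```

Under this assumption, the odd-looking entries in the list are mostly katakana fragments, while ĨĴ maps to the bytes 0x86 0x92, which do not form a complete UTF-8 character on their own.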
    Positive Logits
    " approve"     1.25
    " respond"     1.16
    " react"       1.15
    " disapprove"  1.13
    " agree"       1.09
    " reciproc"    1.09
    " laugh"       1.07
    " flock"       1.05
    " notice"      1.04
    " balk"        1.02
    Act Density 0.393%

    No Known Activations