INDEX
    Explanations

    expressions of approval, disapproval, and commendation in the context of organizational or societal matters

    Negative Logits
    efe             -0.17
    ³               -0.17
     Spinner        -0.15
    ean             -0.15
    IRROR           -0.14
    .relu           -0.14
    lez             -0.14
    ừ               -0.14
    ulan            -0.14
    ead             -0.14
    Positive Logits
    /owl            0.16
    Ø·Ùģ            0.15
    istrovstvÃŃ     0.14
     pha            0.14
    δά              0.14
     åĮ             0.14
     íĬ¹íŀĪ         0.14
     Michaels       0.14
     bip            0.13
    _PF             0.13
    Act Density 0.169%

    No Known Activations