INDEX
    Explanations

    words related to actions or experiences

    sentiments and expressions of loss or concern among people

    Negative Logits
    Token             Logit
    "ertodd"          -0.68
    "isms"            -0.60
    "wise"            -0.60
    "gery"            -0.59
    " grin"           -0.59
    " laughs"         -0.58
    " Garage"         -0.58
    "pex"             -0.57
    " Medium"         -0.56
    " reluct"         -0.56
    Positive Logits
    Token             Logit
    " themselves"     1.03
    " selves"         1.03
    "selves"          0.99
    "atars"           0.75
    " outnumbered"    0.73
    " careers"        0.72
    " collectively"   0.72
    " THEIR"          0.70
    " quotas"         0.70
    " counterparts"   0.69
    Activation Density: 0.507%

    No Known Activations