INDEX
    Explanations

    phrases indicating different physical and emotional states or activities of a person

    expressions of emotional intensity or strong feelings

    Negative Logits
    " attRot"          -0.66
    "Methods"          -0.65
    "azard"            -0.65
    "selves"           -0.64
    " unison"          -0.64
    "arser"            -0.61
    " respectively"    -0.61
    "TPS"              -0.58
    "their"            -0.58
    " jointly"         -0.57
    Positive Logits
    " himself"         1.33
    " Himself"         1.10
    " herself"         1.08
    " his"             1.03
    " tonight"         0.93
    " His"             0.93
    " HIS"             0.93
    "his"              0.87
    "His"              0.86
    " Wife"            0.78
    Act Density 0.791%

    No Known Activations