    Explanations

    comparisons or interactions between individuals in a social setting

    descriptions of personal relationships and social interactions

    Negative Logits
    "etheless"    -0.71
    "moil"        -0.70
    "ornia"       -0.65
    "sequent"     -0.65
    "allel"       -0.63
    "alm"         -0.62
    " Results"    -0.62
    "iren"        -0.61
    "arthed"      -0.61
    "gran"        -0.61
    Positive Logits
    " himself"         0.85
    " remorse"         0.79
    " fuckin"          0.78
    " pacing"          0.73
    " me"              0.72
    " terrific"        0.72
    " asleep"          0.70
    " abras"           0.68
    " tyr"             0.68
    " unbelievable"    0.68
    Activation Density: 0.628%

    No Known Activations