INDEX
    Explanations

    phrases indicating personal relationships

    references to friendships or relationships with individuals

    Negative Logits
    "ItemImage"       -0.65
    "essen"           -0.60
    "aceous"          -0.60
    "NESS"            -0.59
    " assumption"     -0.58
    " evaluations"    -0.58
    " FANT"           -0.58
    " inability"      -0.58
    " Presence"       -0.58
    " ];"             -0.57
    Positive Logits
    " hers"           1.15
    " ours"           1.12
    " yours"          1.02
    " sorts"          1.00
    " theirs"         0.98
    " mine"           0.91
    "ammad"           0.72
    " course"         0.71
    "irlf"            0.69
    "hire"            0.66
    Activation Density: 0.102%

    No Known Activations