INDEX
    Explanations

    references to friends and family in the text

    references to friends and social connections

    Negative Logits
    Token               Logit
    "yss"               -0.69
    "qqa"               -0.65
    "oted"              -0.63
    "tarians"           -0.60
    " chloride"         -0.59
    "acco"              -0.59
    "ocalypse"          -0.59
    " Cout"             -0.58
    "secution"          -0.58
    " seizure"          -0.58

    Positive Logits
    Token               Logit
    "hips"               1.07
    "lier"               0.95
    " acquaintances"     0.94
    "hip"                0.94
    " friends"           0.83
    " collaborators"     0.81
    "Friends"            0.79
    " colleagues"        0.78
    "friends"            0.78
    " acquaintance"      0.78
    Activation Density: 0.050%

    No Known Activations