INDEX
    Explanations

    texts instructing someone to provide information or share details

    requests for information or stories

    Negative Logits
    Token             Logit
    "urdue"           -0.80
    " ILCS"           -0.72
    "rane"            -0.69
    "cdn"             -0.66
    "namese"          -0.64
    "zinski"          -0.64
    "nam"             -0.62
    " elimination"    -0.60
    "JV"              -0.60
    "hered"           -0.59
    Positive Logits
    Token             Logit
    "tale"            1.63
    "ingly"           1.19
    " us"             0.90
    "tell"            0.86
    " tales"          0.81
    " tale"           0.81
    "biz"             0.78
    " me"             0.78
    "ously"           0.75
    " tell"           0.75
    Activation Density: 0.056%

    No Known Activations