INDEX
    Explanations

    activation phrases related to requesting a user's attention

    references to personal or collective perspective, particularly the use of "me" and "us"

    Negative Logits
    iaries            -0.61
     suicides         -0.60
    bidden            -0.59
    _.                -0.56
    ibrary            -0.55
    aks               -0.54
    ranch             -0.54
    icion             -0.53
     fragmentation    -0.52
    hedon             -0.52

    Positive Logits
    rina               0.65
    adow               0.64
    Widget             0.63
     Thumbnails        0.60
    img                0.59
    azz                0.56
    ocol               0.55
    tle                0.55
    borough            0.55
    dam                0.54

    Activation Density: 0.019%

    No Known Activations