INDEX
    Explanations

    words related to urging or advising actions or behaviors

    references to groups of people being encouraged or advised to take specific actions

    Negative Logits
    Token          Logit
    "atile"        -0.67
    "mys"          -0.64
    "ELD"          -0.61
    " Built"       -0.58
    "acebook"      -0.54
    " certs"       -0.53
    "Rated"        -0.51
    "ILA"          -0.50
    "ater"         -0.50
    "anka"         -0.50
    Positive Logits
    Token            Logit
    " beware"        1.08
    " to"            1.01
    " against"       0.92
    " not"           0.90
    " towards"       0.89
    " toward"        0.87
    "not"            0.85
    " NOT"           0.84
    " everywhere"    0.82
    " accordingly"   0.80
    Activation Density: 0.135%

    No Known Activations