INDEX
    Explanations

    words related to criticism or disagreement

    phrases related to being criticized or deemed inappropriate

    Negative Logits
    ositories   -0.68
    iosyncr     -0.64
    Jump        -0.64
    Query       -0.63
    Impro       -0.63
    Roberts     -0.62
     strand     -0.62
    orgetown    -0.62
     rub        -0.61
    dfx         -0.61
    Positive Logits
    alls                      0.88
    ength                     0.85
    ogue                      0.84
    arding                    0.83
    arded                     0.83
    worth                     0.82
    ares                      0.78
     999                      0.78
    alled                     0.77
    BuyableInstoreAndOnline   0.77
    Activation Density: 0.009%

    No Known Activations