INDEX
    Explanations

    words related to disapproval or criticism

    words related to public relations or promotional content

    Negative Logits
     Sacrament      -0.73
     Winds          -0.69
    BILITY          -0.67
    ORED            -0.63
     persistence    -0.63
     Eagle          -0.63
     Royale         -0.62
    born            -0.62
     Americas       -0.61
    croft           -0.60

    Positive Logits
    imate            1.12
    uning            1.10
    imes             1.08
    arians           1.06
    ices             0.97
    atical           0.96
    icy              0.94
    ams              0.94
    icks             0.92
    asion            0.92
    Activation Density: 0.011%

    No Known Activations