INDEX
    Explanations

    words related to negative actions or events, including scandals and violence

    negative terms or references

    Negative Logits
    " Dickinson"   -0.73
    "ulhu"         -0.71
    " Norris"      -0.61
    " Rica"        -0.58
    " AVG"         -0.55
    " Gunn"        -0.55
    " Arabian"     -0.55
    " Burr"        -0.54
    " Richards"    -0.53
    "kson"         -0.51
    Positive Logits
    "sized"        1.01
    "based"        1.01
    "level"        0.91
    "themed"       0.88
    "to"           0.86
    "advertising"  0.86
    "style"        0.86
    "friendly"     0.83
    "centric"      0.82
    "oriented"     0.82
    Activation Density: 0.359%

    No Known Activations