INDEX
    Explanations

    phrases related to public issues or societal impact

    references to public safety and the impact of societal issues

    Negative Logits
    REDACTED      -0.77
    wcsstore      -0.58
    ITNESS        -0.55
     guiActive    -0.53
    OLOG          -0.52
     caveats      -0.51
    onyms         -0.50
    VIDIA         -0.50
    fried         -0.49
    ALT           -0.49
    Positive Logits
     unnecessarily    0.69
     sensibilities    0.66
     downstream       0.64
     exponentially    0.60
     morale           0.60
     prematurely      0.59
     goose            0.59
     tremendously     0.58
     itch             0.57
     competitiveness  0.56
    Activation Density 0.773%

    No Known Activations