    Explanations

    phrases related to criticism and potential negative consequences

    phrases that highlight societal issues and their impacts

    Negative Logits
     Outbreak      -0.72
     Mayhem        -0.69
     scenarios     -0.68
     Problems      -0.65
    detail         -0.64
     Assignment    -0.64
     Incident      -0.62
     incidents     -0.62
     [+            -0.62
    emi            -0.62

    Positive Logits
     cherished      1.29
     bedrock        1.07
     vital          1.03
     pillars        1.02
     cornerstone    1.00
     livelihood     1.00
     pillar         0.99
     precious       0.99
    essential       0.98
     decency        0.97

    Act Density 0.514%

    No Known Activations