    Explanations

    various references to threats and dangers in social, political, and environmental contexts

    Negative Logits
    Token           Logit
    "iao"           -0.21
    "artin"         -0.17
    "ocker"         -0.15
    "Forbidden"     -0.15
    "ocket"         -0.15
    "quine"         -0.14
    "ecided"        -0.14
    "Shock"         -0.14
    ".pixel"        -0.14
    "ctal"          -0.14

    Positive Logits
    Token           Logit
    " posed"        0.40
    " Pos"          0.29
    "posed"         0.25
    " pos"          0.23
    " facing"       0.22
    " faced"        0.22
    "ened"          0.22
    "ening"         0.22
    " pose"         0.22
    " assessment"   0.21

    Activation Density: 0.042%

    No Known Activations