    Explanations

    topics related to violence and its impact on individuals, particularly women

    Negative Logits
    Token        Logit
    REDACTED     -0.62
    " Spawn"     -0.60
    " Hydra"     -0.59
    " Lethal"    -0.58
    " Diver"     -0.58
    "UTC"        -0.57
    " Dmit"      -0.57
    " Seller"    -0.56
    "Shares"     -0.56
    " Hacker"    -0.55
    Positive Logits
    Token            Logit
    "their"          1.26
    "selves"         1.08
    " themselves"    1.03
    " their"         1.00
    "Their"          0.91
    " THEIR"         0.87
    "entimes"        0.86
    "they"           0.84
    " Their"         0.83
    "They"           0.80
    Activation Density: 4.631%

    No Known Activations