INDEX
    Explanations

    phrases related to physical violence and injury

    Negative Logits
    ponential     -0.16
    noch          -0.16
     Consortium   -0.16
    atte          -0.15
    665           -0.15
    arness        -0.14
    neau          -0.14
    uum           -0.14
     polar        -0.13
     colore       -0.13
    Positive Logits
    CHASE         0.16
     unprotected  0.15
    ós            0.15
    tons          0.15
    hay           0.15
    -sensitive    0.15
     sensitive    0.14
     head         0.14
     hay          0.14
    ungan         0.14
    Activation Density: 0.028%

    No Known Activations