INDEX
    Explanations

    references to violent crimes and the individuals involved in them

    Negative Logits
    "ipp"        -0.15
    "essler"     -0.15
    " violence"  -0.14
    "idable"     -0.14
    " viol"      -0.14
    "ールド"      -0.14
    " bey"       -0.13
    "eza"        -0.13
    ".qq"        -0.13
    ".toBe"      -0.13
    Positive Logits
    " Wich"   0.19
    " Pvt"    0.18
    "pong"    0.17
    ".cf"     0.17
    "рок"     0.16
    "Mgr"     0.15
    "undles"  0.15
    "วรร"     0.15
    "ritch"   0.14
    "apsed"   0.14
    Activation Density: 0.248%

    No Known Activations