    Explanations

    words related to criminal activities such as mugging and assault

    references to mugging incidents or related violent acts

    Negative Logits
    ×Ļ×         -0.75
    Domin       -0.71
    edient      -0.71
    IGH         -0.70
     Doctrine   -0.69
    ipher       -0.68
    peak        -0.68
    ISION       -0.68
    cision      -0.66
     Virgin     -0.66
    Positive Logits
     mug        1.17
    gers        1.10
    ging        1.00
    shots       0.99
    ger         0.96
    shot        0.94
    atures      0.87
    ged         0.86
     Mug        0.82
    glers       0.81
    Activation Density: 0.007%

    No Known Activations