INDEX
    Explanations

    references to law enforcement and violent incidents

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.05
    3:0.04
    4:0.05
    5:0.04
    6:0.37
    7:0.05
    8:0.03
    9:0.03
    10:0.13
    11:0.10
    Negative Logits
    heit:-1.26
    uous:-1.22
    IRD:-1.20
    JV:-1.18
    ¯¯:-1.17
    EW:-1.17
     Norn:-1.15
     Latter:-1.14
    catentry:-1.12
    invoke:-1.12
    Positive Logits
    anamo:1.72
    abwe:1.53
     裏�:1.51
    accompan:1.49
    obin:1.45
    EStream:1.44
     destro:1.42
    ibaba:1.41
     Sparks:1.40
     eleph:1.36
    Act Density 0.022%

    No Known Activations