INDEX
    Explanations

    phrases related to weapons and shooting

    references to weapons and violent actions

    Negative Logits
    Token          Logit
    "],"           -0.66
    TOP            -0.66
     friendships   -0.64
    udget          -0.61
     Situation     -0.61
    TABLE          -0.60
     recourse      -0.59
    ĻĤ             -0.59
    Anonymous      -0.59
     discussions   -0.57
    Positive Logits
    Token          Logit
     onto          1.15
     toward        1.08
     darts         1.06
     towards       1.05
     projectiles   0.98
     into          0.94
    wards          0.94
    balls          0.92
     forcefully    0.89
     dart          0.87
    Activation Density 0.307%

    No Known Activations