INDEX
    Explanations

    potential motives and triggers for violent or controversial actions

    themes related to violence and its underlying motives

    Negative Logits
    anwhile   -0.56
    ajor      -0.53
    "!        -0.53
    Marg      -0.52
    lishes    -0.52
    .}        -0.49
    aut       -0.49
    ngth      -0.48
    oya       -0.48
    Morning   -0.48
    Positive Logits
    ?,        1.14
    ,[        1.01
    *,        1.00
     (),      0.95
    /,        0.86
    !,        0.82
     ,        0.82
    $,        0.81
    ,...      0.79
    +,        0.79
    Activation Density: 1.228%

    No Known Activations