INDEX
    Explanations

    phrases related to violent actions, especially targeted at prominent figures

    instances of the word "assassination" and related terms

    Negative Logits
    Token          Logit
    "bos"          -0.85
    "orders"       -0.73
    "rl"           -0.71
    "hist"         -0.70
    "ulia"         -0.69
    "Commerce"     -0.68
    "fm"           -0.67
    " Toys"        -0.66
    "È"            -0.66
    " Seah"        -0.65
    Positive Logits
    Token               Logit
    " assassination"    1.00
    " assassinated"     0.98
    " assass"           0.98
    " assassins"        0.97
    " assassin"         0.93
    " assassinate"      0.90
    " Assass"           0.87
    " slit"             0.86
    " plots"            0.74
    " satir"            0.73
    Activation Density: 0.078%

    No Known Activations