INDEX
    Explanations

    phrases related to legal actions and consequences

    words associated with violence and aggressive actions

    Negative Logits
    Token        Logit
    "zbollah"    -0.82
    "anguage"    -0.63
    "iae"        -0.59
    "pora"       -0.59
    "ather"      -0.55
    "'/"         -0.55
    "kie"        -0.52
    " condem"    -0.51
    "kaya"       -0.51
    "schild"     -0.50
    Positive Logits
    Token     Logit
    " him"    2.82
    "him"     2.06
    " HIM"    1.77
    " Him"    1.75
    " his"    1.62
    "his"     1.53
    "His"     1.46
    "He"      1.39
    " he"     1.27
    " His"    1.20
    Activation Density: 1.004%

    No Known Activations