INDEX
    Explanations

    names of political or public figures

    references to specific individuals or names

    Negative Logits
    "Drop"          -0.67
    "BW"            -0.65
    " Slaughter"    -0.64
    " Ther"         -0.64
    " Rath"         -0.63
    " stalls"       -0.62
    " Fancy"        -0.61
    " Suicide"      -0.61
    "Sov"           -0.61
    " Stall"        -0.61
    Positive Logits
    "ennis"         3.20
    "ribune"        1.70
    "anish"         1.69
    "kick"          1.58
    "erek"          1.52
    "aniel"         1.33
    "iscovery"      1.27
    "ENN"           1.26
    "ampa"          1.24
    "ixon"          1.21
    Activation Density: 0.040%

    No Known Activations