INDEX
    Explanations

    phrases related to theft or illegal actions

    NEGATIVE LOGITS
     Cosponsors   -0.88
    REDACTED      -0.78
    auga          -0.74
    elist         -0.73
    âĸ¬           -0.66
    worldly       -0.66
    Seg           -0.65
     pecul        -0.65
     Kislyak      -0.64
    Reviewer      -0.64
    POSITIVE LOGITS
    ows       1.26
    oried     1.24
    aging     1.11
    agra      1.00
    ager      0.99
    ard       0.97
    ayers     0.97
    nut       0.95
    agers     0.93
    boxes     0.91
    Activation Density: 0.039%

    No Known Activations