INDEX
    Explanations

    words related to legal or criminal activities

    Negative Logits
    PRES         -0.76
    hyde         -0.74
    ▬            -0.74
    Madison      -0.73
    gerald       -0.71
    NEY          -0.70
    lish         -0.68
    BOOK         -0.67
    LY           -0.67
    SPONSORED    -0.67
    Positive Logits
    atter        1.26
    aques        1.23
    umber        1.22
    umbers       1.20
    acer         1.15
    iers         1.14
    umb          1.12
    acent        1.10
    ump          1.08
    asma         1.06
    Act Density 0.014%

    No Known Activations