    Explanations

    words related to law enforcement actions and legal terminology

    Negative Logits
    Token           Logit
    "lihood"        -0.72
    "BSD"           -0.71
    "intendent"     -0.66
    " beware"       -0.64
    "manship"       -0.63
    "terday"        -0.63
    " bold"         -0.62
    " taunt"        -0.62
    " tom"          -0.62
    "nown"          -0.62
    Positive Logits
    Token           Logit
    "achable"       1.24
    "ention"        1.11
    "rans"          1.10
    "ainer"         1.09
    "roit"          1.06
    "ailed"         1.06
    "uned"          1.02
    "ainers"        0.98
    "rit"           0.98
    "rag"           0.98
    Activation Density: 0.029%

    No Known Activations