INDEX
    Explanations

    mentions of police and law enforcement-related terms

    Negative Logits
    оÑĤÑĮ    -0.15
    uzey    -0.15
    æ´¥    -0.15
    ikip    -0.14
    plier    -0.14
    PLIER    -0.13
    κÏħ    -0.13
    ithe    -0.13
    udes    -0.13
    igo    -0.13
    Positive Logits
    ynth    0.14
     Consum    0.14
     Laz    0.14
    crow    0.13
    WithMany    0.13
    _FORCE    0.13
    aro    0.13
    cheme    0.13
    eme    0.13
    aret    0.13
    Act Density 0.026%

    No Known Activations