INDEX
    Explanations

    references to police-related incidents or discussions

    Negative Logits
     pitch
    -0.15
    pitch
    -0.15
    arty
    -0.15
     Pitt
    -0.15
     Peters
    -0.14
     Hatch
    -0.14
     lit
    -0.14
     von
    -0.14
     vom
    -0.14
     Ernst
    -0.14
    Positive Logits
    boru
    0.16
    /cop
    0.15
    liš
    0.14
    uka
    0.14
    าก
    0.14
    巡
    0.14
    YPD
    0.14
    udence
    0.14
    ieber
    0.14
    ovol
    0.14
    Act Density 0.065%

    No Known Activations