INDEX
    Explanations

    references to inmates or prison-related terms

    mentions of inmates and corrections-related terms

    Negative Logits
    rous
    -0.73
    イト
    -0.71
    ーン
    -0.70
    ku
    -0.68
    drive
    -0.67
    ール
    -0.67
     Nare
    -0.67
    orically
    -0.66
    efully
    -0.66
    issan
    -0.66
    Positive Logits
     inmates
    0.93
     inmate
    0.92
    icts
    0.91
    iaries
    0.90
    iary
    0.85
     Facility
    0.75
     incarcerated
    0.72
     Correctional
    0.72
     correctional
    0.70
    arians
    0.70
    Activation Density: 0.026%

    No Known Activations