INDEX
    Explanations

    references to the concept of "prison"

    Negative Logits
    Token        Logit
    "lass"       -0.74
    "issan"      -0.69
    "udden"      -0.69
    "thora"      -0.68
    "oric"       -0.67
    " Bundes"    -0.66
    "laus"       -0.66
    "yip"        -0.65
    "///"        -0.63
    "idy"        -0.61
    Positive Logits
    Token             Logit
    " prisons"        0.93
    "prison"          0.93
    " inmates"        0.92
    " prison"         0.92
    " inmate"         0.85
    " barr"           0.82
    " confinement"    0.82
    " jail"           0.81
    " incarcer"       0.80
    " sentences"      0.80
    Act Density 0.019%
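
The page does not define "Act Density". A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample activations are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    acts = list(activations)
    if not acts:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in acts if a > threshold)
    return fired / len(acts)


# Hypothetical data: 10,000 token positions, the feature fires on 2 of them.
acts = [0.0] * 10_000
acts[17] = 1.3
acts[4242] = 0.4

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.020%
```

Under this reading, a density of 0.019% would correspond to roughly 19 firing positions per 100,000 tokens.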

    No Known Activations