INDEX
    Explanations

    instances of the word "arrests" in the text

    Negative Logits
    yss
    -0.68
    vironment
    -0.61
    psons
    -0.59
    reen
    -0.59
    learning
    -0.58
    Myth
    -0.58
    WE
    -0.58
     enthus
    -0.57
     Mon
    -0.57
    wb
    -0.56
    Positive Logits
     arrests
    0.99
     arrest
    0.87
     Arrest
    0.74
     arrested
    0.73
    onto
    0.73
     decriminal
    0.69
    oppable
    0.69
     detain
    0.68
    eering
    0.68
     quotas
    0.65
    Activation Density: 0.009%

    No Known Activations