INDEX
    Explanations

    mentions of actions related to theft or unauthorized taking

    instances of the word "steal" and its variations

    Negative Logits
    "present"     -0.74
    "band"        -0.73
    "anamo"       -0.71
    "pora"        -0.70
    "olver"       -0.70
    "night"       -0.69
    "ichick"      -0.67
    "bands"       -0.65
    "rehens"      -0.64
    "acerb"       -0.63
    Positive Logits
    " glances"     0.87
    "weed"         0.82
    " stolen"      0.77
    "ster"         0.76
    " stealing"    0.73
    " prey"        0.71
    "sters"        0.71
    " away"        0.70
    "ezvous"       0.69
    " steals"      0.69
    Act Density 0.019%

    No Known Activations