INDEX
    Explanations

    instances where someone reaches for something

    phrases that include the word "for."

    Negative Logits
    "sonian"      -0.66
    "MK"          -0.59
    "properties"  -0.58
    "CLOSE"       -0.57
    "DOWN"        -0.57
    " Sloan"      -0.57
    "soever"      -0.56
    "ashtra"      -0.54
    " Notting"    -0.54
    "heart"       -0.53
    Positive Logits
    "bidden"      1.03
    "gotten"      0.90
    "geries"      0.90
    " instance"   0.86
    "gery"        0.86
    " example"    0.81
    "aging"       0.76
    "ked"         0.75
    " awhile"     0.74
    "cing"        0.73
    Activation Density: 0.164%

    No Known Activations