    Explanations

    requests or calls to action within text

    the word "please" in various contexts

    Negative Logits
    Token       Logit
    "teenth"    -0.68
    "anes"      -0.58
    "raft"      -0.58
    "lier"      -0.55
    "oba"       -0.55
    "roup"      -0.53
    " Vec"      -0.53
    "ao"        -0.53
    "eways"     -0.52
    " waged"    -0.51

    Positive Logits
    Token       Logit
    " please"   3.62
    "please"    2.63
    " PLEASE"   2.62
    " Please"   1.97
    "Please"    1.64
    " beware"   1.32
    " kindly"   1.14
    " THANK"    1.04
    " sorry"    1.02
    " thank"    1.01

    Activation Density: 0.012%

    No Known Activations