INDEX
    Explanations

    Instances of phrasing related to actions or recommendations involving "you" and "to."

    Negative Logits
    Token         Logit
    "Doing"       -0.23
    " Doing"      -0.21
    "DONE"        -0.17
    "doing"       -0.17
    "Done"        -0.17
    "/do"         -0.17
    "esModule"    -0.16
    "rez"         -0.16
    "VERRIDE"     -0.15
    "done"        -0.15

    Positive Logits
    Token         Logit
    " di"         0.23
    "-d"          0.22
    ".d"          0.22
    " d"          0.21
    " dose"       0.20
    " due"        0.20
    " du"         0.20
    " dot"        0.20
    " dee"        0.20
    " does"       0.20

    Act Density 0.089%

    No Known Activations