INDEX
    Explanations

    phrases indicating a reminder or a note

    phrases that signify reminders or important notes

    Negative Logits
    Token         Logit
    "Americ"      -0.68
    "gt"          -0.68
    "luaj"        -0.66
    "Hon"         -0.66
    "anooga"      -0.65
    "Hide"        -0.64
    "uries"       -0.64
    "rill"        -0.64
    " Helpful"    -0.63
    "Gordon"      -0.63
    Positive Logits
    Token         Logit
    " ONLY"       0.98
    " NEVER"      0.92
    " ALSO"       0.90
    " BEFORE"     0.82
    " NOT"        0.82
    " DID"        0.81
    " ALWAYS"     0.81
    " DOES"       0.80
    " MUCH"       0.77
    " actually"   0.77
    Activation Density: 0.582%

    No Known Activations