INDEX
    Explanations

    references to reminders and self-reflection

    Negative Logits
    brtc            -0.76
    easeOut         -0.68
    EXPERIMENTAL    -0.67
    oplan           -0.66
    ضب              -0.66
    crose           -0.65
    Phương          -0.64
    colgroup        -0.64
    ypso            -0.63
    chesse          -0.62
    Positive Logits
     remind         2.20
     reminded       2.15
     reminder       2.09
     reminding      2.08
     reminders      2.04
     reminds        2.02
    remind          1.92
     Reminders      1.88
     Remind         1.86
    Remind          1.85
    Activation Density: 0.053%

    No Known Activations