INDEX
    Explanations

    phrases suggesting caution or the potential for negative consequences

    Follows "don't" or "not"

    Negative Logits
    shund            -0.55
                     -0.49
     "..\..\..\      -0.47
     "..\..\         -0.47
     doGet           -0.46
     benefitted      -0.46
    νον              -0.46
     Nimbus          -0.46
    ماه              -0.45
     indisponible    -0.45
    Positive Logits
     forget    1.10
     worry     1.02
     Forget    0.77
     Worry     0.75
     fret      0.74
    Forget     0.74
    forget     0.72
     panic     0.72
    worry      0.69
     you       0.67
    Act Density 0.086%

    No Known Activations