    Explanations

    verbs that involve halting or preventing something

    repeated calls to take action or halt negative behaviors

    Negative Logits
    Token              Logit
    "Sov"              -0.82
    "ammy"             -0.77
    "ighth"            -0.75
    "olesc"            -0.74
    "ault"             -0.71
    "aths"             -0.69
    "ety"              -0.69
    "VERTISEMENT"      -0.69
    "uth"              -0.67
    "orthy"            -0.67

    Positive Logits
    Token              Logit
    "gap"              0.95
    "watching"         0.93
    "watch"            0.93
    " bothering"       0.86
    " stopping"        0.76
    " breathing"       0.75
    " smoking"         0.75
    "reon"             0.75
    " blinking"        0.73
    "door"             0.71
    Activation Density: 0.025%

    No Known Activations