INDEX
    Explanations

    phrases indicating desire or preference

    expressions of desire or need

    Negative Logits
    Token          Logit
    " Notting"     -0.66
    "ohyd"         -0.65
    "osterone"     -0.65
    "idelines"     -0.63
    "iel"          -0.63
    "estinal"      -0.62
    "strate"       -0.62
    "wald"         -0.62
    "ormonal"      -0.61
    "roach"        -0.60
    Positive Logits
    Token             Logit
    " sake"           0.82
    " forgiveness"    0.73
    " realism"        0.71
    " revenge"        0.67
    " apocalypse"     0.65
    " attention"      0.64
    " panties"        0.64
    " clarity"        0.64
    " haircut"        0.63
    " daddy"          0.63
    Activation Density: 0.128%

    No Known Activations