INDEX
    Explanations

    phrases indicating desires or intentions

    expressions of desire or intent

    Negative Logits
    "pite"           -0.80
    " muster"        -0.74
    "rir"            -0.69
    "recomm"         -0.67
    " yielding"      -0.63
    "render"         -0.62
    " permitting"    -0.62
    " Cosponsors"    -0.62
    "ously"          -0.61
    " attempting"    -0.61
    Positive Logits
    " someday"       0.93
    " ASAP"          0.82
    " louder"        0.78
    "cool"           0.68
    " sooner"        0.68
    " revenge"       0.67
    "reprene"        0.66
    " daddy"         0.65
    " rid"           0.65
    " bigger"        0.65
    Act Density 0.322%

    No Known Activations