INDEX
    Explanations

    phrases indicating desire or preference

    expressions of desire or intent

    Negative Logits
    errors          -0.69
    livious         -0.68
     enthusi        -0.66
    iky             -0.63
    EStreamFrame    -0.63
    mitter          -0.62
    squ             -0.62
    dq              -0.61
    Dro             -0.60
    depend          -0.60
    Positive Logits
     sake           0.71
    awaru           0.70
    reprene         0.69
     purposes       0.69
     fuller         0.66
     better         0.66
     anything       0.64
    acion           0.63
    Continue        0.63
     succeed        0.62
    Act Density 0.059%

    No Known Activations