INDEX
    Explanations

    specific intentions or desires expressed by the speaker

    expressions of desire or wishes

    Negative Logits
    Token         Logit
    "erver"       -0.72
    "manship"     -0.72
    "iop"         -0.70
    "cript"       -0.69
    "cheat"       -0.68
    "mir"         -0.67
    "lish"        -0.66
    "wikipedia"   -0.65
    "eding"       -0.65
    "mans"        -0.65
    Positive Logits
    Token            Logit
    "reprene"         0.70
    "rison"           0.67
    " warr"           0.66
    "ウス"            0.65
    " inspiration"    0.65
    "urities"         0.63
    " permission"     0.62
    "ipal"            0.62
    "nesday"          0.61
    " revenge"        0.60
    Activation Density: 0.061%

    No Known Activations
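
    The density figure above is commonly read as the share of token positions on which the feature fires. The page does not say how it was computed, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of which activate the feature.
sample = [0.0] * 9994 + [0.8, 1.3, 0.2, 2.1, 0.5, 0.9]

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.060%
```

    A feature with a density this low fires on well under one token in a thousand, which fits with the page listing no known activations in its sample.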