INDEX
    Explanations

    texts related to making promises or commitments

    words related to urging, requesting, or promoting actions

    NEGATIVE LOGITS
     Craw        -0.68
    hack         -0.66
    æŃ¦          -0.64
    Poké         -0.63
     Hollow      -0.61
     Dortmund    -0.60
     Dug         -0.60
     Oval        -0.59
    Buzz         -0.58
    ynam         -0.56
    POSITIVE LOGITS
    agree        0.84
    iment        0.82
     beware      0.80
     mercy       0.79
    haps         0.79
    iments       0.79
    ingly        0.78
     modesty     0.78
    soever       0.77
     waive       0.77
    Act Density 0.252%

    No Known Activations