INDEX
    Explanations

    phrases related to willingness to do something

    expressions of willingness or commitment to action

    Negative Logits
    hemy        -0.88
     Anthem     -0.83
    adish       -0.76
    loo         -0.76
    oche        -0.74
    alien       -0.73
    arette      -0.73
    onut        -0.73
    rx          -0.72
    riot        -0.68

    Positive Logits
    theless     0.81
     willing    0.79
     gladly     0.77
     enough     0.76
    terday      0.75
     willingly  0.73
     unres      0.72
     uncond     0.69
     to         0.68
     accept     0.67

    Act Density 0.031%

    No Known Activations