    Explanations

    terms related to encouragement and support for action

    Negative Logits
    Token       Logit
    "id"        -0.77
    "as"        -0.71
    "ber"       -0.69
    "lands"     -0.68
    "}{|"       -0.67
    "io"        -0.66
    "fe"        -0.64
    "bed"       -0.63
    "p"         -0.62
    " Dios"     -0.61
    Positive Logits
    Token               Logit
    " encouraged"       1.74
    " encourages"       1.71
    " encourage"        1.67
    " Encourage"        1.67
    "couraged"          1.63
    "Encourage"         1.60
    " encouragement"    1.60
    " encor"            1.43
    " encourag"         1.42
    "couraging"         1.41
    Activation Density: 0.096%

    No Known Activations