INDEX
    Explanations

    expressions of encouragement and support for actions or behaviors

    NEGATIVE LOGITS
    id                -0.76
    lands             -0.75
    ("")]             -0.75
    ber               -0.65
    as                -0.63
    }{|               -0.63
     off              -0.62
    io                -0.62
    queline           -0.62
    land              -0.61

    POSITIVE LOGITS
     encouraged        2.04
     encourage         2.01
     encourages        2.00
     Encourage         1.97
     encouragement     1.92
    Encourage          1.87
    couraged           1.82
     encouraging       1.74
    couraging          1.73
    encouragement      1.73
    Activation Density 0.096%

    No Known Activations