INDEX
    Explanations

    phrases related to encouragement and support

    Negative Logits
    "id"       -0.70
    "as"       -0.66
    "t"        -0.63
    "io"       -0.62
    "i"        -0.61
    " Kar"     -0.61
    " Dios"    -0.59
    "n"        -0.59
    " off"     -0.59
    "lands"    -0.58
    Positive Logits
    " encouraged"       1.89
    " encourage"        1.88
    " Encourage"        1.86
    " encourages"       1.85
    " encouragement"    1.82
    "couraged"          1.79
    "Encourage"         1.78
    "couraging"         1.66
    " encouraging"      1.65
    "encouragement"     1.62
    Activation Density: 0.135%

    No Known Activations