INDEX
    Explanations

    phrases related to feelings of happiness and satisfaction

    expressions of joy and delight

    Negative Logits
    Token        Logit
    "eworld"     -0.78
    "alcohol"    -0.76
    "organic"    -0.70
    "vernment"   -0.67
    "ifted"      -0.67
    " downed"    -0.65
    "helle"      -0.63
    "xia"        -0.63
    " Rhod"      -0.62
    "otypes"     -0.62
    Positive Logits
    Token        Logit
    " delight"   1.24
    "fully"      1.12
    " pleasure"  0.95
    "urous"      0.93
    "iously"     0.91
    " aston"     0.90
    "ously"      0.89
    "fulness"    0.86
    "ishly"      0.85
    "theless"    0.84
    Activation Density: 0.006%

    No Known Activations