INDEX
    Explanations

    sentences expressing experiences or accomplishments

    positive expressions related to personal experiences and achievements

    Negative Logits
    erv         -0.70
    reference   -0.65
    ague        -0.65
    ifice       -0.58
    azard       -0.58
    abus        -0.58
    orno        -0.58
    ucc         -0.57
    oof         -0.57
    farious     -0.57
    Positive Logits
     improved    0.94
     wonderful   0.88
     rewarded    0.86
     revital     0.85
     amazing     0.85
     rejuven     0.85
     marvelous   0.83
     thank       0.83
     splendid    0.83
     outper      0.82
    Activation Density: 1.586%

    No Known Activations