INDEX
    Explanations

    phrases expressing requests or encouragement for engagement and support

    Negative Logits
    " nonetheless"   -0.66
    " hindsight"     -0.65
    "glers"          -0.63
    " runes"         -0.62
    "panic"          -0.62
    " Sapp"          -0.61
    "iter"           -0.61
    " paraph"        -0.60
    "reek"           -0.59
    "clusions"       -0.59
    Positive Logits
    "bleacher"       0.66
    "ACY"            0.65
    "vice"           0.62
    "iphate"         0.59
    "yll"            0.57
    "atered"         0.57
    " wellbeing"     0.56
    "cellence"       0.56
    "ONSORED"        0.55
    "Ĵ"              0.54
    Act Density 0.018%

    No Known Activations