INDEX
    Explanations

    positive outcomes or achievements

    expressions of quality and value related to actions or outcomes

    Negative Logits
    Token          Logit
    "anwhile"      -0.84
    " Strauss"     -0.68
    "wx"           -0.62
    " horizont"    -0.61
    "rought"       -0.60
    " makers"      -0.60
    "ioxide"       -0.60
    " Creator"     -0.59
    "urgical"      -0.59
    "ennes"        -0.59
    Positive Logits
    Token              Logit
    "sense"            0.79
    " impression"      0.77
    " contributions"   0.72
    " contribution"    0.71
    " dent"            0.70
    "Ò"                0.69
    "commit"           0.68
    " headlines"       0.68
    " strides"         0.67
    "TEXT"             0.66
    Activation Density: 0.116%

    No Known Activations