INDEX
    Explanations

    phrases or words that convey positive attributes or actions

    terminology related to positive outcomes or effects

    Negative Logits
    appings         -0.81
    ptin            -0.80
    oths            -0.79
    puter           -0.79
    adr             -0.78
    conservancy     -0.76
    neys            -0.74
    ngth            -0.73
    RAW             -0.73
    arers           -0.72
    Positive Logits
     reinforcement  1.05
     affirm         0.92
     affirmation    0.90
     feedback       0.84
     outcome        0.83
     vib            0.83
     positive       0.80
     portrayal      0.79
     outlook        0.77
     appraisal      0.76
    Activation Density: 0.028%

    No Known Activations