INDEX
    Explanations

    terms related to rewards, recognition, and incentives

    Negative Logits
    "LookAnd"           -0.79
    "anskje"            -0.61
    " Walkover"         -0.59
    "UIControlState"    -0.59
    "RSpec"             -0.59
    "quiries"           -0.58
    "WithMany"          -0.58
    " الحره"            -0.56
    " døde"             -0.55
    "Enlaces"           -0.54

    Positive Logits
    " reward"            2.47
    " rewards"           2.33
    " Reward"            2.23
    " Rewards"           2.15
    " rewarded"          2.05
    "reward"             2.04
    "Reward"             2.00
    " rewarding"         1.82
    "Rewards"            1.79
    " récompense"        1.75

    Act Density 0.158%

    No Known Activations