INDEX
    Explanations

    terms related to positive experiences and beneficial effects

    favorable effects on outcomes

    Negative Logits
        " defaultstate"    -0.38
        "chien"            -0.36
        " idleness"        -0.32
        " Thacker"         -0.31
        "kurat"            -0.31
        " privées"         -0.31
        " שוליים"          -0.30
        "Artifact"         -0.30
        " Rug"             -0.30
        " başına"          -0.30
    Positive Logits
        " favorably"       0.68
        " favourably"      0.66
        " positive"        0.65
        "Positive"         0.65
        " Positive"        0.65
        "positive"         0.65
        " positif"         0.64
        " POSITIVE"        0.62
        " favourable"      0.60
        " favorable"       0.60
    Activation Density: 0.194%

    No Known Activations