INDEX
    Explanations

    positive or negative evaluative words

    positive and negative evaluations or judgments about situations

    Negative Logits
    Token          Logit
    "aneers"       -0.76
    "ngth"         -0.76
    "arij"         -0.76
    "ividual"      -0.71
    "assemb"       -0.69
    "iries"        -0.68
    "velop"        -0.67
    "rive"         -0.67
    "mop"          -0.65
    "icipated"     -0.64
    Positive Logits
    Token             Logit
    " considering"    1.09
    " because"        0.95
    " 🙂"             0.79
    "!"               0.79
    " eh"             0.78
    " reasoning"      0.77
    "soType"          0.75
    " news"           0.75
    "because"         0.75
    " advice"         0.74
    Activation Density: 0.160%

    No Known Activations