INDEX
    Explanations

    comparative relationships and positive qualities

    evaluative language that reflects opinions about individual worth and capability

    Negative Logits
    " separat"        -0.72
    " occurs"         -0.71
    " VIDEOS"         -0.71
    "Seg"             -0.66
    " excludes"       -0.66
    " disparity"      -0.66
    "Demand"          -0.66
    "ARM"             -0.66
    "VERTISEMENT"     -0.64
    " Conver"         -0.64

    Positive Logits
    " proud"           1.05
    " happiest"        0.99
    " lucky"           0.93
    " wiser"           0.90
    " happy"           0.88
    " aware"           0.86
    " laughing"        0.85
    " pleased"         0.85
    " interested"      0.84
    "ointed"           0.84
    Activation Density: 0.648%

    No Known Activations