INDEX
    Explanations

    words related to attitudes and opinions

    references to social attitudes and perceptions

    Negative Logits
    Token           Logit
    "ded"           -0.79
    "gran"          -0.74
    "amen"          -0.73
    "avez"          -0.71
    "addafi"        -0.70
    "aman"          -0.67
    "cuts"          -0.66
    " Jub"          -0.64
    "Delivery"      -0.64
    "MER"           -0.63
    Positive Logits
    Token           Logit
    " attitudes"    1.07
    " guiActiveUn"  0.89
    "pring"         0.86
    "ocial"         0.84
    "terness"       0.82
    " toward"       0.79
    "insula"        0.76
    " towards"      0.76
    "hovah"         0.76
    "yip"           0.75
    Activation Density: 0.013%

    No Known Activations