INDEX
    Explanations

    words related to the characteristics and behaviors of people

    statements about human traits or behaviors and their tendencies

    Negative Logits
    Token           Logit
    "osate"         -0.81
    "ospons"        -0.80
    "bernatorial"   -0.75
    " Anniversary"  -0.75
    "andise"        -0.73
    "ornia"         -0.71
    " legality"     -0.70
    "inia"          -0.69
    "reau"          -0.67
    "tains"         -0.66
    Positive Logits
    Token           Logit
    " aware"        1.19
    " incapable"    1.18
    " happiest"     1.18
    " smarter"      1.17
    " unaware"      1.15
    " accustomed"   1.13
    " afraid"       1.11
    " happier"      1.11
    " obsessed"     1.10
    " willing"      1.09
    Act Density 0.256%

    No Known Activations