INDEX
    Explanations

    positive personality traits such as being friendly and cheerful

    words associated with friendliness and warmth in social interactions

    Negative Logits
    Token      Logit
    "illion"   -0.83
    "rast"     -0.80
    "liga"     -0.80
    "ahar"     -0.79
    "IGHTS"    -0.78
    "ember"    -0.78
    "anish"    -0.75
    "hner"     -0.75
    "ĸļ"       -0.75
    "iple"     -0.74
    Positive Logits
    Token            Logit
    " confines"      0.92
    " friendly"      0.84
    " minded"        0.78
    " Friendly"      0.76
    "lier"           0.75
    " greeting"      0.73
    "liness"         0.72
    " hello"         0.72
    " introdu"       0.70
    " disposition"   0.70
    Activation Density: 0.018%

    No Known Activations