INDEX
    Explanations

    words and phrases that convey friendliness and positive social interactions

    Negative Logits
    Token           Logit
    "Ìĥ"            -0.16
    "ãĤ¤ãĤ¯"        -0.15
    " cpp"          -0.15
    "ngo"           -0.14
    " greed"        -0.14
    "nam"           -0.14
    "oen"           -0.14
    "iao"           -0.14
    ".Throw"        -0.13
    "ngth"          -0.13
    Positive Logits
    Token           Logit
    "inkel"         0.19
    "assen"         0.18
    "yyyy"          0.16
    " enough"       0.15
    "iswa"          0.15
    "liness"        0.15
    "lier"          0.15
    "ãĥ¶"           0.15
    "ness"          0.14
    " WithEvents"   0.14
    Act Density 0.023%

    No Known Activations
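
Some tokens in the logit lists above (for example Ìĥ, ãĤ¤ãĤ¯ and ãĥ¶) look like encoding debris but are more likely raw vocabulary entries. The sketch below assumes the vocabulary comes from a GPT-2-style byte-level BPE tokenizer, which this page does not confirm. Under that assumption, each character of such a token stands for one byte, and the bytes decode as UTF-8. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2-style byte-level BPE (assumed here)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into readable text."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space that the
            # dashboard already decoded) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤ¤ãĤ¯"))  # prints イク
    print(decode_token("ãĥ¶"))      # prints ヶ
    print(decode_token("liness"))   # prints liness
```

Under the same assumption, Ìĥ decodes to the combining tilde U+0303, which has no visible form on its own. Plain ASCII tokens such as `ness` or `.Throw` pass through unchanged.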