INDEX
    Explanations

    words and phrases related to personality traits, specifically introversion and extroversion

    Negative Logits
    ide
    -0.15
     Pest
    -0.14
    aldo
    -0.14
     Platt
    -0.14
    leigh
    -0.14
    ringe
    -0.14
    eÄį
    -0.14
    otton
    -0.14
    eton
    -0.14
     Monroe
    -0.14
    Positive Logits
     unm
    0.17
    imar
    0.16
    version
    0.15
    verts
    0.15
    ãĥ¼ãĥŃ
    0.15
     estate
    0.14
    prung
    0.14
    ool
    0.14
     Gel
    0.14
    super
    0.14
    Activation Density 0.027%

    No Known Activations