    Explanations

    references to introverted personality traits and behaviors

    Negative Logits
    emd         -0.16
    vocab       -0.16
    ocr         -0.16
    apus        -0.15
    esture      -0.15
    IMENT       -0.15
    onnement    -0.15
    adge        -0.15
    steller     -0.15
    lez         -0.14
    Positive Logits
    ennon            0.17
     trait           0.16
     Mattis          0.16
    raits            0.15
     traits          0.15
    309              0.15
     compensation    0.15
     Tow             0.14
     toward          0.14
     subs            0.14
    Activation Density: 0.026%

    No Known Activations