INDEX
    Explanations

    emotional expressions and experiences

    NEGATIVE LOGITS
    Token          Logit
    " ATTRIBUTE"   -0.15
    "ripe"         -0.14
    "arto"         -0.14
    "IDEO"         -0.14
    " RECEIVER"    -0.13
    "IPH"          -0.13
    "Dick"         -0.13
    "ENUM"         -0.13
    "olars"        -0.13
    "inen"         -0.13
    POSITIVE LOGITS
    Token          Logit
    " YA"          0.23
    " Authors"     0.22
    " AUTHORS"     0.22
    " Ner"         0.21
    "Authors"      0.21
    " authors"     0.20
    " Na"          0.20
    " fandom"      0.20
    "YA"           0.19
    " agents"      0.18
    Activation Density: 0.017%

    No Known Activations