INDEX
    Explanations

    words related to excitement or enjoyment

    Negative Logits
    gra      -0.17
    yyy      -0.17
    olis     -0.17
    tt       -0.15
    yyyy     -0.15
    y        -0.15
    sur      -0.14
    ytt      -0.14
    veral    -0.14
    sol      -0.14
    Positive Logits
    ey       0.23
    igans    0.20
    chie     0.20
    ze       0.18
    peed     0.18
    iple     0.17
    zers     0.17
    porno    0.17
    zy       0.16
    ie       0.16
    Activation Density: 0.016%

    No Known Activations