INDEX
    Explanations

    proper nouns related to names and titles

    Negative Logits
    ender       -0.17
    imuth       -0.17
    umble       -0.16
     setFrame   -0.15
    ymm         -0.15
    pez         -0.15
    lero        -0.15
    tero        -0.15
    iem         -0.15
    endra       -0.15
    Positive Logits
    ris         0.24
    ree         0.23
    rist        0.21
    ring        0.20
    ury         0.20
    rik         0.19
    reek        0.19
    rim         0.18
    idd         0.18
    rin         0.18
    Activation Density: 0.031%

    No Known Activations