INDEX
    Explanations

    repetitions of the word "you."

    Negative Logits
    onymous   -2.10
    ogr       -1.99
     asleep   -1.66
    otle      -1.63
    igraph    -1.58
    ocal      -1.55
    emic      -1.52
    onym      -1.51
    uric      -1.50
    ocur      -1.50
    Positive Logits
    ´         2.22
    ī         2.11
    Ŀ         2.08
    illard    2.07
    ettes     2.06
    ľ         2.01
    ¯         2.01
    ffer      1.99
    ¿         1.89
    µ         1.89
    Activation Density 0.174%

    No Known Activations