    Explanations

    references to unusual or unconventional characteristics

    Negative Logits
    "arian"      -0.17
    "apor"       -0.16
    "è±Ĭ"        -0.15
    "tee"        -0.15
    "owitz"      -0.15
    "toi"        -0.14
    "azer"       -0.14
    "raphics"    -0.14
    " Mori"      -0.14
    " Kra"       -0.14

    Positive Logits
    "ball"        0.40
    "ities"       0.32
    "yssey"       0.31
    "balls"       0.31
    "-ball"       0.27
    "ity"         0.26
    "-number"     0.24
    "ments"       0.23
    "/e"          0.23
    "Ball"        0.22
    Activation Density: 0.010%

    No Known Activations