    Explanations

    repetitions of the word "one."

    Negative Logits
    Token              Logit
    "ly"               -0.82
    "UnusedPrivate"    -0.70
    " ―――――"           -0.70
    "CrossRef"         -0.69
    " inox"            -0.68
    " Bettina"         -0.67
    " raiſ"            -0.66
    " 대해"            -0.66
    "titian"           -0.65
    " ſch"             -0.63
    Positive Logits
    Token              Logit
    " ONE"             1.25
    " One"             1.24
    "One"              1.18
    " one"             1.16
    "ONE"              1.07
    "one"              1.07
    " jednego"         0.88
    "updateOne"        0.88
    " jedną"           0.86
    " jeden"           0.86
    Activation Density: 0.158%

    No Known Activations