INDEX
    Explanations

    instances of the word "many."

    Negative Logits
    "icari"         -0.18
    "sm"            -0.16
    "umer"          -0.14
    "eldorf"        -0.14
    "jen"           -0.14
    "istrovství"    -0.14
    "nist"          -0.14
    "ses"           -0.14
    "eniable"       -0.14
    "cad"           -0.14
    Positive Logits
    "ToMany"        0.25
    "-many"         0.23
    " times"        0.22
    "/all"          0.22
    "-times"        0.21
    "-sided"        0.19
    " different"    0.18
    "々"            0.17
    "fold"          0.17
    " of"           0.16
    Act Density 0.077%
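
The density figure above is presumably the fraction of token positions on which the feature fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and chosen purely for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 77 of which are active.
toy_activations = [0.0] * 99_923 + [1.5] * 77

density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.077%"
```

Under this reading, a density of 0.077% corresponds to roughly 77 active positions per 100,000 tokens, which would make this a sparse feature.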

    No Known Activations