INDEX
    Explanations

    phrases that emphasize any variety of situations or elements

    Negative Logits
     quelques
    -0.16
     Some
    -0.16
    يه
    -0.15
     some
    -0.15
    ania
    -0.15
    一些
    -0.15
    reira
    -0.14
    ея
    -0.14
    ped
    -0.14
    rej
    -0.14
    Positive Logits
    /all
    0.33
    ones
    0.30
    THING
    0.30
    place
    0.28
     sort
    0.25
     kind
    0.24
    one
    0.23
    thin
    0.23
    kind
    0.22
    ONE
    0.22
    Act Density 0.099%

    No Known Activations