    Explanations

    occurrences of the word "ou."

    Negative Logits
    ni          -0.19
    nels        -0.17
    ammers      -0.16
    ãĥ³         -0.15
    ases        -0.15
    nan         -0.15
    nu          -0.15
    rieving     -0.15
    nic         -0.15
    nie         -0.15

    Positive Logits
    nger         0.18
    illet        0.18
    cou          0.17
    thern        0.17
    ltre         0.17
    lt           0.17
    verture      0.17
    eurs         0.17
    ette         0.16
    theast       0.16

    Act Density 0.031%

    No Known Activations