INDEX
    Explanations

    en followed by French words

    Negative Logits
     nouve        -0.10
     plage        -0.10
     Rog          -0.09
    gue           -0.09
    urf           -0.09
    uds           -0.09
     Rae          -0.09
     plaisir      -0.09
     rencontre    -0.09
    cient         -0.09
    Positive Logits
    vers          0.11
    lever         0.11
     cascade      0.11
    .wikipedia    0.11
     attendant    0.11
    raci          0.11
    rig           0.10
    flamm         0.10
     Ir           0.10
    onces         0.10
    Activation Density: 0.036%

    No Known Activations