INDEX
    Explanations

    variations of the word "no"

    Negative Logits
    Token       Logit
    "ko"        -0.17
    "rech"      -0.17
    "nee"       -0.15
    "neath"     -0.15
    "kin"       -0.15
    "rego"      -0.15
    "ray"       -0.15
    "cour"      -0.15
    "rik"       -0.15
    "rick"      -0.14
    Positive Logits
    Token       Logit
    "xious"      0.30
    " longer"    0.28
    "things"     0.27
    "zzle"       0.26
    " matter"    0.26
    " doubt"     0.25
    "venta"      0.25
    "isy"        0.25
    "veau"       0.24
    "ël"         0.24
    Activation Density: 0.043%

    No Known Activations