INDEX
    Explanations

    references to inclusivity and universality

    Negative Logits
    Token       Logit
    "kle"       -0.16
    "Ì"         -0.16
    "relude"    -0.15
    "eral"      -0.14
    "elia"      -0.14
    "ysi"       -0.14
    "lz"        -0.13
    "uard"      -0.13
    "609"       -0.13
    "lü"        -0.13
    Positive Logits
    Token           Logit
    " sorts"        0.17
    "walk"          0.17
    "/Dk"           0.16
    " sexes"        0.16
    "ninger"        0.16
    " seasons"      0.16
    " genders"      0.15
    "286"           0.15
    "createForm"    0.15
    " quanh"        0.15
    Activation Density: 0.128%

    No Known Activations