INDEX
    Explanations

    gendered nouns and their associated articles in a variety of contexts

    Negative Logits
    amient     -0.15
     Cabr      -0.15
    ista       -0.14
    otland     -0.14
     Tribe     -0.14
    adium      -0.14
    onn        -0.14
    shed       -0.14
     togg      -0.14
    véd        -0.14
    Positive Logits
    warn       0.15
    оля        0.14
    wart       0.14
    arro       0.14
    itto       0.14
    une        0.14
    anny       0.14
    anson      0.14
    unto       0.14
    dum        0.14
    Act Density 0.074%

    No Known Activations