INDEX
    Explanations

    references to male pronouns and their variations

    Negative Logits
    الثة                -0.49
                        -0.45
     wedi               -0.44
     nhiêu              -0.44
     követ              -0.43
    脚注の使い方        -0.43
     galima             -0.42
    umumkan             -0.41
     estudian           -0.41
    goa                 -0.41
    Positive Logits
    __*/                0.89
    ]]]                 0.82
    ']]                 0.81
    OGND                0.79
    AsUp                0.79
    }))                 0.77
     Audiodateien       0.77
    (!__                0.77
    .")]                0.75
    ')))                0.74
    Activation Density: 0.193%

    No Known Activations