INDEX
    Explanations

    phrases related to gender differences and societal expectations

    descriptive phrases and specific concepts

    Negative Logits
    الحياه    -0.65
     Accesat    -0.65
    Gön    -0.62
     totiž    -0.60
    GetAxis    -0.60
    ljeno    -0.60
    agaimana    -0.60
     tartalomajánló    -0.57
     Geraadpleegd    -0.56
     tajam    -0.55
    Positive Logits
    </h2>    1.72
    </h4>    1.47
    </h3>    1.41
    </strong>    1.40
    </h5>    1.38
    </b>    1.34
    </u>    1.21
    </h1>    1.21
    </h6>    1.14
    }$}    1.03
    Activation Density: 0.743%

    No Known Activations