Explanations
phrases related to gender differences and societal expectations
Followed by a noun
descriptive phrases and specific concepts
Negative Logits
الحياه            -0.65
Accesat           -0.65
Gön               -0.62
totiž             -0.60
GetAxis           -0.60
ljeno             -0.60
agaimana          -0.60
tartalomajánló    -0.57
Geraadpleegd      -0.56
tajam             -0.55
Positive Logits
</h2>        1.72
</h4>        1.47
</h3>        1.41
</strong>    1.40
</h5>        1.38
</b>         1.34
</u>         1.21
</h1>        1.21
</h6>        1.14
}$}          1.03
Activations Density: 0.743%