NEGATIVE LOGITS
Grain        0.35
san          0.34
grep         0.34
கல்லறை       0.34
период       0.34
期間          0.33
gobiernos    0.33
Period       0.33
සා           0.32
வீன          0.32
POSITIVE LOGITS
gender       0.65
dysph        0.60
genders      0.58
roles        0.55
stereotypes  0.54
Gender       0.52
gender       0.52
relations    0.50
Gender       0.49
sex          0.48
Activations Density 0.038%