Explanations
constructions discussing rights, particularly emphasizing expression and portrayal of individuals, with a focus on gender representation
Negative Logits
Token       Logit
Efq         -0.93
myſelf      -0.81
ſeveral     -0.80
ſelves      -0.77
Majefty     -0.74
feveral     -0.71
Jefus       -0.71
Anſ         -0.70
Monfieur    -0.70
Perſ        -0.70
Positive Logits
Token         Logit
ništvo        0.58
vej           0.53
好不容易      0.52
ouncements    0.48
ьаж           0.48
исленность    0.48
worst         0.47
Women         0.46
מיט           0.46
spice         0.45
Activations Density 0.272%