INDEX
Explanations
statements related to gender and sexual orientation issues, particularly involving sexism and LGBTQ+ topics
Negative Logits
psilon        -0.17
roje          -0.15
стан          -0.15
iddleware     -0.14
ibo           -0.14
lio           -0.14
ients         -0.14
nee           -0.14
.Pattern      -0.14
_defaults     -0.13
Positive Logits
ven           0.17
foreign       0.16
dispatch      0.15
Foreign       0.15
Baker         0.15
crack         0.15
excursion     0.14
dispatch      0.14
Foreign       0.14
unic          0.14
Activations Density: 0.134%