Explanations
references to female-related topics and attributes
Negative Logits
Token       Logit
in          -0.60
(           -0.57
I           -0.56
,           -0.56
            -0.56
as          -0.55
or          -0.54
            -0.54
and         -0.52
is          -0.51
Positive Logits
Token       Logit
NameInMap   1.09
Paglinawan  1.07
Houſe       1.00
Majefty     0.97
Efq         0.95
houſe       0.94
Jefus       0.92
LookAnd     0.91
Anſ         0.90
Tembelea    0.90
Activations Density: 0.187%