Explanations
terms related to feminism and its various expressions
Negative Logits
Token       Logit
iol         -0.17
ays         -0.17
aim         -0.16
Carnegie    -0.16
æģ¯         -0.16
rig         -0.15
iam         -0.15
exp         -0.14
pc          -0.14
ocking      -0.14
Positive Logits
Token       Logit
inity       0.21
ine         0.19
inine       0.18
INE         0.17
oir         0.16
icide       0.15
ãĥ¼ãĥį      0.15
inely       0.14
isting      0.14
inite       0.14
Activations Density: 0.008%
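
Two of the listed tokens (æģ¯ and ãĥ¼ãĥį) look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The page does not say which tokenizer produced them, so the sketch below assumes a GPT-2 style byte-to-character table; `bytes_to_unicode` and `decode_token` are illustrative helper names, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable character table, as in GPT-2 style byte-level BPE.

    Assumption: the tokens on this page come from such a vocabulary.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points from 256 upward.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8.

    Raises KeyError for characters outside the table (e.g. a literal space).
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # The two unusual tokens from the tables, written as escapes, plus a plain one.
    for shown in ("\u00e6\u0123\u00af", "\u00e3\u0125\u00bc\u00e3\u0125\u012f", "inine"):
        print(repr(shown), "->", repr(decode_token(shown)))
```

Under that assumption, æģ¯ decodes to 息 and ãĥ¼ãĥį decodes to ーネ, while plain ASCII tokens such as inine are unchanged. If the underlying model uses a different tokenizer, the mapping does not apply.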