Explanations
terms related to misogyny
terms related to misogyny and its manifestations
Negative Logits
FTWARE       -0.75
DCS          -0.70
rendition    -0.69
Ley          -0.67
Bulldogs     -0.65
Hulk         -0.65
fracturing   -0.64
HER          -0.63
Lans         -0.63
optic        -0.63

Positive Logits
orship        1.06
oir           1.05
xes           0.95
uses          0.95
istries       0.89
orem          0.88
andro         0.86
oret          0.85
orious        0.85
otes          0.84

Activations Density 0.027%