Explanations
phrases indicating relationships and social dynamics
Negative Logits
Token          Logit
brotherhood    -0.85
łbym           -0.80
menswear       -0.75
masculinity    -0.75
manhood        -0.75
Grandpa        -0.75
Grandfather    -0.74
Mr             -0.74
grandpa        -0.74
fathers        -0.74
Positive Logits
Token          Logit
woman          1.56
women          1.54
girl           1.44
female         1.37
Women          1.37
lady           1.35
woman          1.35
women          1.32
girls          1.32
Woman          1.31
Activations Density: 1.217%