Explanations
phrases emphasizing inclusivity and unity
phrases emphasizing inclusivity and the common good for diverse groups
Negative Logits
potion     -0.77
illac      -0.67
Kamp       -0.63
Schwar     -0.63
Caption    -0.63
anova      -0.62
ritic      -0.61
utsche     -0.60
oute       -0.60
aminer     -0.60
Positive Logits
kinds      1.26
ocating    1.20
sorts      1.16
igators    1.06
iances     1.03
iance      1.00
facets     1.00
ogene      0.97
owing      0.93
genders    0.92
Activations Density: 0.120%