Explanations
phrases relating to ethics, values, and social principles
Negative Logits
Giới              -0.40
AssemblyCulture   -0.40
tagHelperRunner   -0.40
Autoritní         -0.38
DockStyle         -0.37
BASEPATH          -0.35
yakit             -0.34
횟                -0.34
meras             -0.34
HandlerContext    -0.34
Positive Logits
equality          0.92
fairness          0.90
transparency      0.84
honesty           0.84
tolerance         0.82
openness          0.81
inclusion         0.75
inclu             0.74
accountability    0.70
unity             0.69
Activations Density: 0.537%