Explanations
concepts related to moral dilemmas and the complexities of rights and actions
Negative Logits
andon            -0.15
ycop             -0.14
aml              -0.14
ãĥ¬ãĥ³           -0.14
legate           -0.14
ihan             -0.13
-alist           -0.13
ABL              -0.13
izabeth          -0.13
Normalization    -0.13
Positive Logits
benefit          0.49
Benefit          0.39
benef            0.36
advance          0.34
benefits         0.33
benef            0.32
benefited        0.32
Benef            0.32
benefiting       0.31
BEN              0.30
Activations Density 0.479%
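
The density figure above is usually read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading on toy data. The function name `activation_density`, the zero threshold, and the sample values are assumptions made for illustration, not details taken from this page or from the tool that produced it.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The dashboard's exact rule is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which are active.
toy = [0.0] * 995 + [0.7, 1.2, 0.3, 2.1, 0.05]
print(f"{activation_density(toy):.3%}")  # prints 0.500%
```

Under this reading, a density of 0.479% would correspond to roughly 4.8 active positions per 1000 sampled tokens.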