INDEX
Explanations
concepts related to moral rights and ethical dilemmas
Negative Logits
Split         -0.15
empl          -0.15
_syntax       -0.15
Transparent   -0.14
credit        -0.13
IEW           -0.13
transparent   -0.13
akis          -0.13
alten         -0.13
Represents    -0.13
Positive Logits
society       0.26
rights        0.21
societies     0.20
laws          0.18
duties        0.17
mor           0.17
Bent          0.17
liberty       0.17
government    0.17
Rights        0.17
Activations Density 0.938%