INDEX
Explanations
references to responsibility and accountability
Negative Logits
German    -0.42
twin      -0.41
combin    -0.40
aerop     -0.39
Avic      -0.38
Eure      -0.38
Burt      -0.38
pose      -0.38
Mex       -0.38
MP        -0.37
POSITIVE LOGITS
responsibility     1.45
responsibility     1.38
Responsibility     1.29
responsabilidad    1.28
accountability     1.26
Responsibility     1.25
Verantwortung      1.22
responsabilité     1.16
ansvar             1.16
responsible        1.16
Activations Density 0.181%