Explanations
phrases indicating a sense of responsibility or duty
references to job responsibilities and duties
Negative Logits
rumours    -0.74
eros       -0.74
rumors     -0.71
Lot        -0.71
acan       -0.70
ongh       -0.70
axy        -0.68
pot        -0.67
ston       -0.66
AX         -0.66
Positive Logits
obligation        0.84
responsibility    0.71
competence        0.70
duty              0.68
emancipation      0.68
restraint         0.68
constraint        0.68
task              0.67
theorem           0.65
respons           0.65
Activations Density 0.151%