INDEX
Explanations
words related to being well-informed or making responsible decisions
terms related to informed decision-making and responsibility
Negative Logits
stals         -0.83
estamp        -0.73
AMS           -0.72
bows          -0.70
mysteriously  -0.68
Devi          -0.68
nen           -0.67
oried         -0.66
yip           -0.65
Reloaded      -0.65
Positive Logits
observer       1.07
consideration  1.06
disclosure     1.00
consent        0.97
contemplation  0.96
decision       0.96
adherence      0.95
stewards       0.93
ness           0.92
bystand        0.92
Activations Density: 0.133%