Explanations
phrases related to being open to doing something or accepting a certain situation
expressions of willingness or ability to take action
Negative Logits
icles     -0.87
icle      -0.86
verbs     -0.78
groups    -0.76
Surv      -0.71
Siber     -0.71
Tid       -0.69
DA        -0.68
idden     -0.68
inas      -0.66
Positive Logits
willingness      1.02
attitude         0.87
unwillingness    0.86
yip              0.82
adherence        0.80
guiActiveUn      0.78
ï¸               0.78
reluctance       0.77
ACTIONS          0.75
reliance         0.75
Activations Density: 0.032%