Explanations
instances where individuals or groups show a willingness to take action or make changes
references to willingness or unwillingness
Negative Logits
Token   Logit
og      -0.67
iner    -0.64
ogs     -0.63
ast     -0.62
uv      -0.60
!       -0.59
Mim     -0.59
ilk     -0.57
Neb     -0.57
quer    -0.57
Positive Logits
Token           Logit
willingness     3.48
unwillingness   2.36
reluctance      1.89
refusal         1.69
readiness       1.69
openness        1.68
propensity      1.64
willing         1.56
inability       1.55
desire          1.50
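The logit lists above rank vocabulary tokens by how strongly the feature pushes their output logits up or down. The page does not say how these values are computed. The sketch below assumes the common approach of taking the dot product of a feature's output direction with each token's unembedding vector, then keeping the extremes. The names `logit_effects` and `top_k` are hypothetical, and the two-dimensional vectors are toy values invented for illustration, not the ones behind the numbers listed here.

```python
from typing import Dict, List, Tuple


def logit_effects(direction: List[float],
                  unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot product of the feature direction with each token's unembedding vector."""
    return {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_k(effects: Dict[str, float],
          k: int) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (most positive, most negative) tokens, k of each."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    negative = ranked[:k]        # smallest values first
    positive = ranked[::-1][:k]  # largest values first
    return positive, negative


# Toy data (assumption): a 2-dimensional feature direction and a 4-token vocabulary.
direction = [1.0, -0.5]
unembed = {
    "willingness": [3.0, -1.0],
    "desire": [1.0, -1.0],
    "og": [-0.5, 0.5],
    "the": [0.0, 0.0],
}

effects = logit_effects(direction, unembed)
positive, negative = top_k(effects, k=2)
print("positive:", positive)
print("negative:", negative)
```

Under this reading, a large positive value means the feature makes that token more likely as the next output, which would be consistent with the "willingness" explanations listed earlier.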
Activations Density: 0.018%
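As a rough reading of the density figure, assuming it means the fraction of sampled tokens on which the feature has a nonzero activation (the page does not define it), 0.018% corresponds to about 18 active tokens per 100,000. The helper `activation_density` and the toy activation list below are hypothetical and only illustrate that arithmetic.

```python
from typing import List


def activation_density(activations: List[float]) -> float:
    """Fraction of tokens whose activation is strictly positive."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Toy sample (assumption): 18 active tokens out of 100,000.
toy_activations = [1.0] * 18 + [0.0] * (100_000 - 18)
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

A density this low would suggest the feature fires on a narrow set of contexts rather than on common tokens.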