INDEX
Explanations
expressions of willingness or readiness to engage or take action
Negative Logits
Token        Logit
egl          -0.17
uga          -0.16
zan          -0.15
omy          -0.15
ially        -0.15
affle        -0.15
yun          -0.15
as           -0.15
yla          -0.15
OrDefault    -0.15
Positive Logits
Token        Logit
ness         0.27
willing      0.22
iam          0.20
NESS         0.20
ough         0.18
participant  0.18
suspension   0.17
kommen       0.17
ToUpdate     0.17
/un          0.17
Activations Density: 0.017%