INDEX
Explanations
phrases related to willingness or readiness to do something
the phrase "willing to" in various contexts
Negative Logits
Compass                            -0.86
Offline                            -0.77
////////////////////////////////   -0.74
_.                                 -0.73
                                   -0.72
ARS                                -0.66
Detected                           -0.66
Veh                                -0.64
Haunted                            -0.63
Wan                                -0.63
POSITIVE LOGITS
accept      1.14
embrace     1.14
sacrifice   1.03
cooperate   1.03
give        1.02
abandon     1.00
tolerate    0.98
admit       0.96
undertake   0.96
sell        0.95
Activations Density 0.078%
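
To put the density figure in concrete terms, here is a minimal sketch. It assumes that activation density means the fraction of sampled token positions on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of active positions; the source page
    does not state how the figure is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 78 active positions out of 100,000 tokens.
sample = [1.0] * 78 + [0.0] * (100_000 - 78)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.078%
```

Under that assumption, a density of 0.078% corresponds to roughly 78 active positions per 100,000 tokens, which fits a feature tied to a narrow phrase pattern such as "willing to".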