INDEX
Explanations
conditional statements indicating a desire or preference
expressions of desire or intention
Negative Logits
Xuan         -0.63
Trap         -0.61
Gutenberg    -0.61
Frag         -0.59
Fortune      -0.59
Kag          -0.59
Gad          -0.59
Kinder       -0.59
pires        -0.58
Hebdo        -0.58
Positive Logits
advise          1.14
recommend       1.12
sugg            1.08
gladly          1.05
prefer          1.04
characterize    1.03
appreciate      1.00
nominate        1.00
suggest         0.99
expect          0.98
Activations Density: 0.071%