INDEX
Explanations
phrases expressing personal preferences or recommendations
expressions of desire or intention
Negative Logits
Token        Logit
Gutenberg    -0.74
Kag          -0.69
çİĭ          -0.60
Basics       -0.60
Anim         -0.59
Kinder       -0.59
Cook         -0.59
PLUS         -0.58
DATA         -0.57
Material     -0.56
Positive Logits
Token        Logit
gladly       1.04
advise       0.91
dearly       0.89
definitely   0.88
never        0.86
recommend    0.86
sugg         0.85
assume       0.83
summarize    0.83
certainly    0.83
Activations Density: 0.083%