INDEX
Explanations
expressions of gratitude and requests
phrases that express intent or requests
Negative Logits
furt        -0.76
Nash        -0.66
grounds     -0.66
wise        -0.63
Abel        -0.63
bars        -0.62
raw         -0.62
Nile        -0.62
metadata    -0.60
cases       -0.59
Positive Logits
emulate         1.06
propose         1.03
encourage       1.01
recreate        1.00
participate     0.97
congratulate    0.96
clarify         0.93
capitalize      0.92
assure          0.92
postpone        0.91
Activations Density: 0.065%