INDEX
Explanations
punctuation marks, especially periods and exclamation points
sentences that convey a sense of urgency or importance
Negative Logits
glim          -0.81
oun           -0.79
awei          -0.75
quir          -0.75
rall          -0.73
ailability    -0.72
onga          -0.72
excav         -0.68
appra         -0.68
overcl        -0.68
Positive Logits
:(            1.15
pic           1.12
Sorry         1.11
:)            1.09
Please        1.05
Seems         1.05
Thanks        1.04
Thank         1.04
:-)           1.00
              0.98
Activations Density 0.409%