INDEX
Explanations
exclamatory phrases expressing strong emotions
Negative Logits
atories    -0.74
oult       -0.70
unker      -0.66
aldi       -0.65
saline     -0.64
ativity    -0.62
lining     -0.62
auld       -0.61
reen       -0.61
transpl    -0.61
Positive Logits
!!!!!      1.14
!!!        1.03
!!         0.98
!/         0.90
@#         0.87
:-)        0.84
9999       0.83
!!!!       0.82
?!         0.82
Please     0.78
Activations Density 5.869%
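
The density figure above can be read as the share of token positions on which the feature fires. That reading is an assumption here, since the page does not define the term. A minimal sketch under that assumption follows; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active (activation strictly greater than the threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data, not taken from the dashboard:
# 3 of the 8 positions have a non-zero activation.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 5.869% would mean the feature is active on roughly 1 in 17 sampled token positions.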