Explanations
exclamatory expressions or interjections expressing surprise or amazement
expressions of surprise or excitement
Negative Logits
rive                  -0.62
actionDate            -0.58
arcity                -0.57
channelAvailability   -0.56
pend                  -0.55
progressively         -0.55
viol                  -0.54
redistributed         -0.54
icipated              -0.53
``                    -0.53
Positive Logits
zers                  1.20
!                     1.16
!,                    1.08
hhh                   1.08
!!                    1.03
!!!!                  1.02
ww                    1.01
!!!                   0.99
hh                    0.99
hhhh                  0.99
Activations Density: 0.113%