Explanations
exclamations and emotionally charged language
expressions of personal identity and relationships
Negative Logits
SPONSORED       -0.98
actionDate      -0.75
)]              -0.73
largeDownload   -0.71
principally     -0.69
unrem           -0.68
)].             -0.67
rhet            -0.62
footnote        -0.61
pragmatic       -0.60
Positive Logits
!               2.19
!!              2.14
!!!             2.03
!!!!            1.91
?!              1.81
!!!!!           1.80
!?              1.80
!'              1.74
!.              1.73
!:              1.72
Activations Density: 0.508%