Explanations
exclamations suggesting surprise or disbelief
expressions of surprise or strong emotion
Negative Logits
certain          -0.70
generally        -0.64
partly           -0.63
general          -0.61
broadly          -0.61
similar          -0.60
grav             -0.60
various          -0.60
predominantly    -0.58
principally      -0.58
Positive Logits
?!       2.56
!!       2.45
!!!      2.38
!!!!     2.27
!!!!!    2.22
!",      2.16
!".      2.13
!        2.12
!),      2.08
!).      2.05
Activations Density: 0.032%