Explanations
phrases indicating uncertainty or speculation
conditional statements and expressions of uncertainty
Negative Logits
Token       Logit
kefeller    -0.66
olini       -0.65
ipal        -0.58
Hands       -0.56
pione       -0.56
Andersen    -0.55
Citiz       -0.54
glers       -0.53
Shots       -0.53
ãĤ¶         -0.52
Positive Logits
Token       Logit
.           1.17
!           1.14
!.          1.14
.[          1.12
.--         1.12
.(          1.08
;)          1.07
:)          1.06
.<          1.06
!,          1.04
Activation Density: 0.599%
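
The density figure above is read here as the share of token positions on which the feature has a non-zero activation. That reading is an assumption, since the source does not define the term. A minimal Python sketch under that assumption follows; the function name `activation_density` and the sample activation values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard itself does not state this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 6 of which activate the feature.
sample = [0.0] * 994 + [0.3, 1.2, 0.7, 2.1, 0.05, 0.9]

density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.600%"
```

Under this reading, a density of 0.599% would mean the feature fires on roughly 6 of every 1000 tokens in whatever text sample the dashboard was computed over.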