INDEX
Explanations
phrases related to commands, warnings, and instructions
phrases that express certainty or existence
Negative Logits
icio      -0.56
artney    -0.55
arlane    -0.54
ibaba     -0.52
76561     -0.50
yip       -0.50
20439     -0.49
ento      -0.49
yrinth    -0.49
uggest    -0.49
Positive Logits
!         1.36
!:        1.33
!.        1.29
;)        1.23
!!!       1.20
.:        1.19
🙂        1.18
:)        1.17
!!!!      1.17
!,        1.17
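
Note on the emoji token: dashboards built on byte-level BPE vocabularies often print the raw token string, so the 🙂 entry (1.18) can show up as ðŁĻĤ. Below is a minimal sketch of the decoding, assuming the model uses the GPT-2 byte-to-unicode table (the source does not name the tokenizer). `bytes_to_unicode` and `decode_token` are illustrative helpers, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild a GPT-2 style table mapping each byte to a printable character.

    Assumption: the tokenizer follows the GPT-2 byte-level BPE convention.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: printable character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a raw byte-level token string back into readable text.

    Raises KeyError if the string contains a character outside the table.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ðŁĻĤ"))
```

The four characters map to the bytes F0 9F 99 82, which is the UTF-8 encoding of U+1F642. Plain ASCII tokens such as `!!!` pass through unchanged.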
Activations Density 0.737%
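
The source does not define activation density. A common reading, assumed here, is the fraction of tokens on which the feature's activation is above zero. A minimal sketch under that assumption follows; `activation_density` and the sample values are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic per-token activations, for illustration only.
sample = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.4, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.737% would mean the feature fires on roughly 7 of every 1,000 tokens.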