Explanations
phrases related to personal transformation and growth
Negative Logits
—↵↵       -0.18
--↵↵      -0.17
--)↵      -0.16
ãĢ        -0.14
--)       -0.13
âĸį       -0.13
EqualTo   -0.13
')."      -0.12
%).↵↵     -0.12
raph      -0.12
Positive Logits
;         0.19
:         0.19
.         0.18
;         0.17
:         0.15
),        0.15
=         0.15
|         0.14
?         0.14
!         0.14
Activations Density: 1.573%