INDEX
Explanations
actions and processes related to creativity, development, and improvement in various contexts
Negative Logits
/from    -0.27
/her     -0.22
/of      -0.21
/the     -0.20
/on      -0.19
/to      -0.18
/or      -0.18
/she     -0.16
/out     -0.15
/how     -0.15
Positive Logits
一下     0.25
the      0.22
a        0.21
ulate    0.21
them     0.21
both     0.21
some     0.21
ively    0.20
what     0.20
的是     0.20
Activations Density 2.389%