INDEX
Explanations
phrases expressing desire or invitations
Negative Logits
âķIJ      -0.17
IEW      -0.15
Bubble   -0.14
zew      -0.14
iece     -0.14
ies      -0.14
rotch    -0.14
ocol     -0.14
ortex    -0.13
aft      -0.13
Positive Logits
extend     0.24
extended   0.23
extends    0.22
extended   0.20
extend     0.20
thank      0.19
Extended   0.19
extends    0.18
åĬ         0.18
Extended   0.17
Activations Density: 0.035%