Explanations
phrases expressing desires or invitations
Negative Logits
âķIJ     -0.17
ortex   -0.16
IEW     -0.15
ies     -0.14
obe     -0.14
cdf     -0.14
913     -0.14
eti     -0.13
essor   -0.13
mixer   -0.13
Positive Logits
thank      0.25
Thank      0.22
thanking   0.22
THANK      0.21
Thank      0.20
thanked    0.19
ĸī         0.18
extend     0.18
thanks     0.18
express    0.18
Activations Density 0.037%