Explanations
expressions of intent or desire for collaboration
Negative Logits
my  -0.31
saya  -0.31
tôi  -0.29
they  -0.28
我的  -0.27
mijn  -0.26
meiner  -0.25
我  -0.25
họ  -0.25
their  -0.25
Positive Logits
ourselves  0.74
we  0.54
ours  0.48
我们  0.44
我們  0.43
our  0.41
we  0.39
μας  0.39
，我们  0.36
우리는  0.36
Activations Density 0.013%
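
The density figure is presumably the fraction of token positions on which the feature fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 13 active positions out of 100,000 tokens.
acts = [0.0] * 100_000
for i in range(13):
    acts[i * 7_000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.013% would mean the feature is active on roughly 13 of every 100,000 tokens.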