    Explanations

    expressions of intent or desire for collaboration

    Negative Logits
    " my"        -0.31
    " saya"      -0.31
    " tôi"       -0.29
    " they"      -0.28
    "我的"       -0.27
    " mijn"      -0.26
    " meiner"    -0.25
    "我"         -0.25
    " họ"        -0.25
    " their"     -0.25
    Positive Logits
    " ourselves" 0.74
    " we"        0.54
    " ours"      0.48
    "我们"       0.44
    "我們"       0.43
    " our"       0.41
    "we"         0.39
    " μας"       0.39
    "，我们"     0.36
    " 우리는"    0.36
    Act Density 0.013%

    No Known Activations