INDEX
    Explanations

    words related to communication and instruction

    Negative Logits
    " yourself"      -0.33
    " Yourself"      -0.28
    " yourselves"    -0.23
    "your"           -0.22
    "Your"           -0.21
    " можете"        -0.18
    "，你"           -0.18
    " your"          -0.18
    " Ihrem"         -0.17
    " Your"          -0.17
    Positive Logits
    " him"           0.31
    " thee"          0.29
    " ya"            0.23
    "cha"            0.23
    " ihn"           0.23
    " us"            0.22
    "CHA"            0.20
    "inya"           0.19
    " Ihnen"         0.19
    " тебя"          0.19
    Activation Density: 0.200%

    No Known Activations