INDEX
    Explanations

    expressions emphasizing collective achievements or identities

    Negative Logits
     Incontri
    -0.17
    adu
    -0.15
    inki
    -0.15
    yun
    -0.14
    pliant
    -0.14
     Replies
    -0.13
    Ñĥка
    -0.13
     Scha
    -0.13
    ASK
    -0.13
     подк
    -0.13
    Positive Logits
    arhus
    0.15
    readcr
    0.15
    Ø´ÙĨ
    0.14
    ido
    0.14
     Chess
    0.14
     ware
    0.14
    ุà¸ķ
    0.14
    oque
    0.13
    gons
    0.13
    ¼åIJĪ
    0.13
    Activation Density 0.463%

    No Known Activations