INDEX
    Explanations

    expressions of gratitude and acknowledgments of contributions

    NEGATIVE LOGITS
    orro
    -0.16
     Virt
    -0.16
    virt
    -0.15
    ND
    -0.15
    zel
    -0.14
     erfahren
    -0.14
    vor
    -0.13
    Ì
    -0.13
    tee
    -0.13
    rous
    -0.13
    POSITIVE LOGITS
     everyone
    0.26
     all
    0.26
    所有
    0.24
     wszyst
    0.22
     semua
    0.22
    すべて
    0.21
     everybody
    0.20
    everyone
    0.20
     جميع
    0.18
     tất
    0.18
    Act Density 0.076%

    No Known Activations