INDEX
    Explanations

    words and phrases that imply inclusivity or the presence of multiple elements

    Negative Logits
    また
    -0.17
    么
    -0.16
    igo
    -0.14
     дост
    -0.14
    istr
    -0.14
     обо
    -0.14
    tır
    -0.13
     neither
    -0.13
    oster
    -0.13
    osit
    -0.13
    Positive Logits
     ones
    0.48
     those
    0.45
    those
    0.34
    :
    0.32
    ones
    0.32
     Those
    0.29
    Those
    0.28
     Ones
    0.27
    :↵
    0.27
     تلك
    0.24
    Act Density 0.205%

    No Known Activations