INDEX
    Explanations

    instances of a specific word across multiple languages

    Negative Logits
    oud         -0.17
    ones        -0.17
    ONES        -0.16
    gmt         -0.15
    然          -0.15
    >Main       -0.14
    oods        -0.14
    lad         -0.14
    ibly        -0.14
     ing        -0.14
    Positive Logits
    alic        0.29
    ordin       0.27
    ordinator   0.25
    oper        0.23
    ордин       0.23
    OPER        0.21
    ordinate    0.21
    operator    0.21
    нач         0.19
    opers       0.19
    Act Density 0.009%

    No Known Activations