INDEX
    Explanations

    characters or words written in a specific script

    Negative Logits
    Token     Logit
    "Õ"       -0.17
    "stry"    -0.16
    "157"     -0.15
    "ĥĿ"      -0.15
    " ye"     -0.15
    " idi"    -0.15
    "áŀ"      -0.15
    "á"       -0.14
    "ج"       -0.14
    "ай"      -0.14
    Positive Logits
    Token       Logit
    "à¥į"       0.22
    "à¥"        0.20
    " म"        0.20
    " स"        0.19
    " à¤ķ"      0.19
    "à¥ĩ"       0.19
    "ा"         0.19
    " द"        0.18
    " प"        0.18
    "à¥įâĢį"    0.18
    Act Density 0.017%

    No Known Activations
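
Several of the tokens listed above (for example `à¥į` or `à¤ķ`) look like byte-level BPE token strings rather than readable text. The sketch below assumes, without confirmation from this page, that the underlying tokenizer uses the GPT-2 style byte-to-character table; `bytes_to_unicode` rebuilds that table and `decode_token` is a hypothetical helper that maps a displayed token back to UTF-8 text. Under that assumption, `à¥į` corresponds to the bytes E0 A5 8D, which is U+094D, the Devanagari virama, consistent with the explanation that the feature promotes characters in a specific script.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a displayed token string into readable text.

    Characters found in the byte table are mapped back to their raw byte;
    anything else (already-decoded text, plain spaces) is passed through
    as its own UTF-8 encoding. Incomplete byte sequences decode to U+FFFD.
    """
    raw = b"".join(
        bytes([CHAR_TO_BYTE[ch]]) if ch in CHAR_TO_BYTE else ch.encode("utf-8")
        for ch in token
    )
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # A few tokens copied from the positive-logit list above.
    for tok in ["à¥į", " à¤ķ", "à¥ĩ", " म"]:
        decoded = decode_token(tok)
        print(repr(tok), "->", repr(decoded), [hex(ord(c)) for c in decoded])
```

Tokens that hold only part of a multi-byte character, such as `à¥` or `á`, cannot be decoded on their own and come back as the replacement character U+FFFD.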