INDEX
    Explanations

    specific names of people, organizations, and locations

    Negative Logits
    ģm        -0.36
    erm       -0.30
    ám        -0.29
    äm        -0.29
    orm       -0.29
    óm        -0.28
    ãĥ¼ãĥł    -0.28
    mam       -0.28
    irm       -0.28
    imum      -0.28
    Positive Logits
    à¸Ļà¸Ń    0.13
    ınca      0.13
    tıģ       0.12
    çĶ        0.12
    tıģı      0.12
    isen      0.11
    даеÑĤ     0.10
    ılıp      0.10
    ;line     0.10
    ξε        0.10
    Act Density 0.456%

    No Known Activations