INDEX
    Explanations

    possessive forms of nouns

    Negative Logits
    z       -0.25
    y       -0.22
    e       -0.21
    i       -0.21
    er      -0.20
    c       -0.20
    al      -0.20
    a       -0.20
    Ùĩ      -0.19
    à¸Ļ     -0.19

    Positive Logits
    ï¸ı     0.18
    ζη      0.17
    pha     0.14
    aman    0.14
    ÐĽÐ¬    0.13
    /-      0.13
    .au     0.13
    inn     0.13
    /DD     0.13
    lef     0.13

    Act Density 0.098%

    No Known Activations