INDEX
    Explanations

    unique cultural references and names related to specific locations or subjects

    Negative Logits
    lar       -0.28
    ìķĺ       -0.24
    ra        -0.24
    ca        -0.23
    ìķĺëĭ¤    -0.23
    ban       -0.23
    va        -0.20
    ça        -0.20
    ja        -0.19
    ba        -0.19
    Positive Logits
    zet       0.28
    iben      0.21
    etty      0.20
    ben       0.19
    ye        0.19
    dre       0.18
    ül        0.18
    де        0.18
    ivel      0.17
    inde      0.17
    Activation Density: 0.005%

    No Known Activations