INDEX
    Explanations

    religious titles and roles

    Negative Logits
    К  1.36
    Д  1.34
    П  1.23
    к  1.20
    Я  1.19
    Ко  1.10
    И  1.10
    コの  1.09
    কিন্ত  1.07
    1.07
    Positive Logits
    ose  1.28
    ität  1.26
    tum  1.22
    eksi  1.18
    𝐭  1.18
    tors  1.16
    t  1.16
    en  1.16
    ark  1.16
    ON  1.16
    Act Density 0.001%

    No Known Activations