INDEX
    Explanations

    common pronouns and auxiliary verbs indicating subjectivity and time

    Negative Logits
    Token       Logit
    deniz       -0.16
    imir        -0.15
    ãĥ«ãĥķ      -0.15
    vier        -0.15
    Ñģклад      -0.15
    Ìģt         -0.15
    grp         -0.14
    oÄŁ         -0.14
    rien        -0.14
    combe       -0.13
    Positive Logits
    Token       Logit
    دÛĮد        0.16
    ste         0.15
    ÑĭÑĪ        0.15
    153         0.14
    illery      0.14
    äch         0.14
    669         0.14
    POP         0.14
    obs         0.13
     Velvet     0.13
    Activation Density: 0.015%

    No Known Activations
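
Several tokens in the logit lists (for example `Ñģклад`, `oÄŁ`, `ãĥ«ãĥķ`) look like they are displayed in a raw byte-level BPE form rather than as readable text. The sketch below shows one way to map such strings back to Unicode. It assumes the model's tokenizer uses the GPT-2 style byte-to-character table, which the page itself does not confirm; `bytes_to_unicode` and `decode_token` are illustrative names, not part of the dashboard.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2 style byte-level BPE."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and up.
            printable.append(b)
            chars.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


# Reverse table: display character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Best-effort decode of a token string that may be partly byte-encoded."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character is already plain Unicode (assumed pre-decoded by the page).
            raw.extend(ch.encode("utf-8"))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 after mapping, so treat the token as already readable.
        return token


if __name__ == "__main__":
    for tok in ["Ñģклад", "oÄŁ", "ãĥ«ãĥķ", "ÑĭÑĪ", "äch", "deniz"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

The fallback to the original string is a deliberate choice: a token such as `äch` is already readable, and treating its characters as raw bytes would not produce valid UTF-8. Decoded forms are only a reading aid; the strings shown in the lists remain the identifiers as the page reports them.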