INDEX
    Explanations

    Ö and similar characters

    Negative Logits
    Token         Logit
    " words"      0.46
    "φων"         0.45
    "zeros"       0.41
    " fora"       0.41
    " coraz"      0.40
    "words"       0.40
    " zeros"      0.38
    (blank)       0.38
    " numbered"   0.38
    " поду"       0.38
    Positive Logits
    Token            Logit
    "टक"             0.40
    "ürü"            0.40
    (blank)          0.40
    "nti"            0.39
    " افر"           0.39
    (blank)          0.38
    " Suggestions"   0.38
    "一身"           0.38
    "특히"           0.37
    "ティック"       0.37
    Act Density 0.002%

    No Known Activations