    Explanations

    Cyrillic characters

    Specific Cyrillic characters, particularly the letter "н" and its variants

    Negative Logits
    merce       -0.86
    undai       -0.84
    ichita      -0.82
    perature    -0.81
    kins        -0.81
    atche       -0.78
    yip         -0.77
    eanor       -0.77
    HCR         -0.74
    pload       -0.74
    Positive Logits
    и           1.15
    оÐ          1.07
    а           1.06
    н           1.02
    ÑĤ          1.02
    о           1.00
    е           0.97
    Ñĭ          0.96
    к           0.93
    Ñ           0.93
    Act Density 0.008%

    No Known Activations