INDEX
    Explanations

    Russian adjective endings

    Negative Logits
     použit    1.72
    ות         1.70
    ed         1.66
               1.63
    edly       1.58
               1.57
    ει         1.55
    ities      1.55
    '',        1.52
     дә        1.52
    Positive Logits
    гация      1.61
    տ          1.60
    ч          1.58
    cr         1.56
    ري         1.55
    См         1.46
    rilor      1.42
               1.41
    ❤️❤️       1.41
               1.39
    Act Density 0.085%

    No Known Activations