INDEX
    Explanations

    aspirations and objectives

    Negative Logits
    тра      0.65
    𝐬        0.64
    нуться   0.63
    ции      0.62
    ذ        0.61
    ма       0.61
    ъ        0.61
    с        0.58
    ті       0.57
    \        0.55
    Positive Logits
    ES       0.80
    ע        0.79
    IS       0.72
    EN       0.71
    AIN      0.71
    ED       0.71
    ↵↵       0.70
    IG       0.69
             0.68
    ET       0.67
    Activation Density: 1.074%

    No Known Activations