INDEX
    Negative Logits
    Token               Value
     in                 1.27
    но                  1.22
    ки                  1.17
    ف                   1.09
    ре                  1.08
    ку                  1.08
    ری                  1.08
    (no visible token)  1.08
     at                 1.06
    я                   1.06
    Positive Logits
    Token               Value
    r                   1.52
    n                   1.26
    تهم                 1.19
    l                   1.19
    i                   1.16
     расходы            1.11
    us                  1.10
    (no visible token)  1.09
    il                  1.08
     площадь            1.06
    Act Density 0.025%

    No Known Activations