INDEX
    Explanations
    Negative Logits
    خ    1.07
    قت    1.05
    Ди    1.01
    Д    0.99
    Си    0.98
    Мо    0.95
    RE    0.94
    '    0.93
    Ре    0.92
    corr    0.91
    Positive Logits
    ль    1.31
    ıcı    1.27
    in    1.21
    en    1.21
    時候    1.17
     πάντα    1.16
    también    1.16
    ция    1.14
    ният    1.14
    1.13
    Act Density 0.542%

    No Known Activations