INDEX
    Explanations
    Negative Logits
    ۰۰    1.88
    elem    1.54
    ्स    1.54
    ة    1.48
    pyrimidine    1.38
    ০০০    1.36
    <unused702>    1.34
    tin    1.30
    Pedidos    1.28
    satz    1.26
    Positive Logits
    ра    1.57
    il    1.54
    wheeled    1.41
    š    1.38
    pping    1.34
    quela    1.30
    в    1.29
    п    1.25
    ş    1.25
     sammen    1.23
    Activation Density: 0.153%

    No Known Activations