    Negative Logits
    Token       Value
    "1"         1.50
    "s"         1.24
    " There"    1.23
    "ad"        1.16
    "b"         1.16
    " and"      1.14
    " It"       1.12
    "et"        1.11
    " a"        1.08
    "c"         1.08
    POSITIVE LOGITS
    Token        Value
                 1.09
                 1.07
    "もら"        0.92
    " یورو"      0.89
                 0.86
    "padă"       0.86
                 0.85
    " ಸಂಧಿ"      0.83
    " ಹಲವಾರು"    0.82
    " في"        0.79
    Act Density 0.361%

    No Known Activations