INDEX
    Explanations
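    NEGATIVE LOGITS
    1.47  "ايا"
    1.28  " commemorate"
    1.25  "ه"
    1.24  "dür"
    1.22  "န်"
    1.18  "veel"
    1.17  (token text not captured)
    1.16  " transversely"
    1.15  "neho"
    1.15  (token text not captured)
    POSITIVE LOGITS
    1.15  "K"
    1.08  " K"
    1.04  "उप"
    1.00  " оби"
    0.99  "ers"
    0.98  " Hoa"
    0.94  "Wrapping"
    0.93  " up"
    0.92  "ように"
    0.88  (token text not captured)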
    Act Density 0.080%

    No Known Activations