INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
    "ri"          0.98
    "ra"          0.95
    "rt"          0.93
    "lu"          0.89
    "u"           0.89
    "lighting"    0.88
    "ln"          0.86
    "lin"         0.85
    "lat"         0.82
    "wara"        0.82
    POSITIVE LOGITS
    Token             Logit
    "TL"              1.35
    "R"               1.13
    "ش"               1.06
    "D"               0.99
    " TL"             0.95
    "M"               0.95
    "N"               0.95
    " którym"         0.89
    "م"               0.89
    (not rendered)    0.86
    Activation Density: 0.002%

    No Known Activations