INDEX
    Negative Logits
    Token             Logit
    "al"              0.94
    (token missing)   0.93
    "𝑠"               0.90
    (token missing)   0.90
    "ist"             0.89
    "ReLU"            0.85
    " Szcz"           0.85
    "𝑚"               0.85
    "ore"             0.84
    " Cz"             0.84
    Positive Logits
    Token             Logit
    " comité"         1.21
    " outubro"        1.10
    " recorr"         1.08
    " setembro"       1.08
    " revenus"        1.08
    " constamment"    1.07
    " diverses"       1.06
    "шымта"           1.06
    " frontière"      1.05
    " sélection"      1.03
    Activation Density: 0.001%

    No Known Activations