INDEX
    Explanations
    Negative Logits
     pousser     -0.08
     pousse      -0.08
    .gz          -0.08
     cresce      -0.07
     words       -0.07
     Sabb        -0.07
     gratu       -0.07
     Griechen    -0.07
                 -0.07
    $q           -0.07
    Positive Logits
    iling         0.08
    utely         0.08
     Fakt         0.08
    aisin         0.08
    attering      0.07
    ving          0.07
    ిస            0.07
    okt           0.07
    -M            0.07
                  0.07
    Act Density 0.001%

    No Known Activations