INDEX
    Explanations
    Negative Logits
    Token           Logit
    "fel"           -0.09
    "lated"         -0.08
    "Fel"           -0.08
    " childhood"    -0.08
    " Mensch"       -0.08
    " cilvē"        -0.08
    "irse"          -0.08
    "Pairs"         -0.08
    " Needless"     -0.08
    " празд"        -0.08
    Positive Logits
    Token           Logit
    " results"      0.09
    " Ergebnisse"   0.08
    " Results"      0.08
    " interval"     0.08
    " resultados"   0.08
    " risultati"    0.08
    " intervals"    0.08
    " overcame"     0.07
    " overcome"     0.07
    " overcoming"   0.07
    Activation Density: 0.002%

    No Known Activations