    NEGATIVE LOGITS
    Token               Logit
     유명               -0.10
    amanho              -0.08
    ächen               -0.08
    ahia                -0.08
    人気                -0.08
    isation             -0.08
     лу                 -0.08
    [token not shown]   -0.08
    inals               -0.07
     сложно             -0.07
    POSITIVE LOGITS
    Token               Logit
     bijdrage           0.09
     contributing       0.09
    (anim               0.09
     Beitrag            0.09
     kontrib            0.08
    [token not shown]   0.08
    贡献                0.08
     contribuir         0.08
     groundwork         0.08
     yadda              0.08
    Activation Density: 0.004%

    No Known Activations