INDEX
    Explanations
    Negative Logits
    Token            Logit
    " CIT"           -0.07
    "Treat"          -0.07
    " inheritance"   -0.07
    " іст"           -0.07
    "Sf"             -0.07
    " запр"          -0.07
    " infra"         -0.07
    (no token text)  -0.07
    " Dmit"          -0.07
    "Inheritance"    -0.07
    Positive Logits
    Token            Logit
    " curated"       0.10
    " curate"        0.09
    " lọ"            0.09
    "Voc"            0.08
    " inan"          0.08
    " lựa"           0.08
    (no token text)  0.08
    " നടക്ക"         0.07
    " Individuals"   0.07
    " mutants"       0.07
    Activation Density: 0.075%

    No Known Activations