INDEX
    Explanations
    Negative Logits
    Token          Logit
    " 初始化"      -0.07
    "series"       -0.07
    ".uniform"     -0.07
    " sill"        -0.06
    " akan"        -0.06
    " этого"       -0.06
    "اليا"         -0.06
    " яке"         -0.06
    " некотор"     -0.06
    " pull"        -0.06
    Positive Logits
    Token                Logit
    (token not shown)    0.06
    "lasyon"             0.06
    " paperback"         0.06
    " dmg"               0.06
    " Swimming"          0.06
    "mov"                0.06
    " bapt"              0.06
    "odule"              0.06
    (token not shown)    0.06
    "ATUS"               0.06
    Act Density 0.008%

    No Known Activations