INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    " C"             -0.09
    "id"             -0.08
    " X"             -0.08
    ".render"        -0.07
    " proficient"    -0.07
    "(?)"            -0.07
    " render"        -0.07
    "uffs"           -0.07
    " avoiding"      -0.07
    " generate"      -0.07
    POSITIVE LOGITS
    Token            Logit
    " hagati"        0.08
    " laarin"        0.08
    " Lut"           0.08
    " запис"         0.08
    " Marley"        0.08
    " başlam"        0.08
    " kugirango"     0.08
    " eder"          0.08
    "\tholder"       0.08
    "-mini"          0.08
    Activation Density: 0.001%

    No Known Activations