INDEX
    Explanations
    Negative Logits
    Token        Logit
     kış         -0.06
     чому        -0.06
    ğa           -0.06
    -star        -0.06
    note         -0.06
    (blank)      -0.06
    601          -0.06
    ителя        -0.06
     Blocks      -0.06
    skills       -0.06
    Positive Logits
    Token        Logit
     Та          0.07
    (mask        0.07
    (blank)      0.06
     Maven       0.06
    (Paint       0.06
    ตะว          0.06
    	Log         0.06
     duplicates  0.06
    دام          0.06
     oracle      0.06
    Act Density 0.023%

    No Known Activations