    Negative Logits
    Token             Logit
    " Typ"            -0.07
    " onay"           -0.06
    ">The"            -0.06
    "その他"          -0.06
    (missing)         -0.06
    " Sử"             -0.06
    " yeni"           -0.06
    "สนาม"            -0.06
    " 있던"           -0.06
    " conspicuous"    -0.06

    Positive Logits
    Token             Logit
    "-num"            0.07
    "dbuf"            0.07
    " SS"             0.06
    "\tarray"         0.06
    " arbitration"    0.06
    (missing)         0.06
    " Р"              0.06
    " Вік"            0.06
    " warfare"        0.06
    "-wrap"           0.06
    Activation Density: 0.002%

    No Known Activations