    Negative Logits
     mangel             -0.09
     gebrek             -0.08
     spiel              -0.08
     onderdeel          -0.07
    playing             -0.07
    不好                -0.07
     playful            -0.07
    	X                  -0.07
    кс                  -0.07
     droog              -0.07
    Positive Logits
     preoc              0.08
     anymore            0.08
    פת                  0.08
                        0.08
    [{                  0.08
    ùa                  0.08
    <Pair               0.08
     costly             0.08
     necessariamente    0.08
     coûte              0.08
    Act Density 0.033%

    No Known Activations