    NEGATIVE LOGITS
     endemic
    -0.07
    oxide
    -0.07
     fácil
    -0.07
     completing
    -0.06
     proj
    -0.06
     noct
    -0.06
     Universities
    -0.06
     tightly
    -0.06
    	err
    -0.06
     предпоч
    -0.06
    POSITIVE LOGITS
     substantive
    0.13
    ерь
    0.07
    UPLOAD
    0.07
    ("/{
    0.06
    0.06
    ाइक
    0.06
     unsub
    0.06
    y
    0.06
    :list
    0.06
     amber
    0.06
    Activation Density 0.017%

    No Known Activations