INDEX
    Explanations
    Negative Logits
    	filter         -0.07
     στους         -0.07
    pedido         -0.06
    ide            -0.06
    "That          -0.06
    IDE            -0.06
     seaside       -0.06
    .assertEquals  -0.06
    (blank)        -0.06
     причины       -0.06
    Positive Logits
    awks           0.06
     evil          0.06
     renown        0.06
    .sex           0.06
    atology        0.06
     preceding     0.06
    \Helpers       0.06
    (blank)        0.06
     sanat         0.06
    ITS            0.06
    Activation Density: 0.007%

    No Known Activations