INDEX
    Explanations
    Negative Logits
    pare
    -0.07
     pří
    -0.06
    Displays
    -0.06
     fundament
    -0.06
     biểu
    -0.06
    (pl
    -0.06
     taş
    -0.06
    	free
    -0.06
     bfs
    -0.06
    -0.06
    Positive Logits
     shrinking
    0.07
     shrink
    0.07
    roman
    0.06
     Shutdown
    0.06
    0.06
     hudeb
    0.06
    wik
    0.06
    ных
    0.06
     adjustments
    0.06
     suffers
    0.06
    Act Density 0.004%

    No Known Activations