    Negative Logits
    Token            Logit
    " dos"           -0.09
    " livelihood"    -0.08
    " União"         -0.08
    "ardu"           -0.07
    "ුණු"            -0.07
    " incentives"    -0.07
    " prog"          -0.07
    "dos"            -0.07
    " schemas"       -0.07
    " komand"        -0.07
    Positive Logits
    Token            Logit
    "/back"          0.08
    " Leroy"         0.08
    " frustr"        0.08
    " llor"          0.08
    " Booking"       0.08
    " Geometry"      0.07
    "GLISH"          0.07
    " solver"        0.07
    " pierwszy"      0.07
    " sijait"        0.07
    Activation Density: 0.002%

    No Known Activations