INDEX
    Explanations
    Negative Logits
    " Describe"      -0.07
    "\tconf"         -0.07
    " системи"       -0.06
    " grav"          -0.06
    " різ"           -0.06
    "(out"           -0.06
    " loosen"        -0.06
    "        "       -0.06
    " sober"         -0.06
    "         "      -0.06
    Positive Logits
    " rental"        0.09
    " replacement"   0.09
    " Rental"        0.08
    "replacement"    0.08
    " rentals"       0.08
    "rial"           0.07
    "integral"       0.07
    "ματα"           0.07
    "unta"           0.07
    "ениях"          0.07
    Activation Density: 0.004%

    No Known Activations