INDEX
    Explanations

    XML tags and math notation

    Negative Logits
     devolución
    0.53
    0.53
     đào
    0.52
     loja
    0.52
     руба
    0.52
     mân
    0.51
     возле
    0.51
    0.51
    0.51
    ắn
    0.50
    Positive Logits
    0.60
     \
    0.56
    		
    0.56
     whose
    0.55
    L
    0.55
     <
    0.54
     trivially
    0.54
     {
    0.53
                    
    0.53
     functor
    0.52
    Act Density 0.009%

    No Known Activations