INDEX
    Explanations

    use of elements/features

    Negative Logits
    captures     -0.07
     але         -0.06
     वह          -0.06
     Sikh        -0.06
     표시        -0.06
     tangled     -0.06
     پسر         -0.06
     شامل        -0.06
     имеют       -0.06
     phê         -0.06
    Positive Logits
    (gt           0.06
    (scope        0.06
    _installed    0.06
    ollow         0.06
    ibration      0.06
    σταση         0.06
    [l            0.06
    (whitespace)  0.06
    AMPL          0.06
                  0.06
    Activation Density: 0.068%

    No Known Activations