INDEX
    Explanations
    NEGATIVE LOGITS
     והמ  -0.08
     surtout  -0.08
    ്യൂ  -0.08
    ാണ്  -0.08
     soprattutto  -0.08
     trú  -0.07
    pletion  -0.07
     dud  -0.07
     fighter  -0.07
    Deviation  -0.07
    POSITIVE LOGITS
     numbering  0.09
     numbered  0.09
     clientes  0.08
     tasks  0.08
    ા�  0.08
     ગ્રાહ  0.08
     клиентов  0.08
     заказ  0.08
     alphabet  0.08
     alphabetical  0.08
    Activation Density 0.028%

    No Known Activations