INDEX
    NEGATIVE LOGITS
    Token             Logit
    " adjustments"    -0.09
    "ిప"              -0.08
    "ilingual"        -0.08
    " reasoning"      -0.08
    "oston"           -0.08
    "ిమాన"            -0.08
    " הנ"             -0.07
    " счита"          -0.07
    " sentiment"      -0.07
    "νων"             -0.07
    POSITIVE LOGITS
    Token             Logit
    ".begin"          0.10
    " comienza"       0.09
    " comenzar"       0.09
    "begin"           0.09
    " begins"         0.09
    " börjar"         0.09
    " Aruba"          0.09
    " alkaa"          0.09
    " Begins"         0.09
    " rozpoc"         0.08
    Activation Density: 0.216%

    No Known Activations