    Negative Logits
    Token             Logit
    " manchmal"       0.75
    " qualche"        0.74
    " примерно"       0.66
    " alguno"         0.65
    " sometimes"      0.64
    " atau"           0.64
    " ungef"          0.62
    "いずれ"          0.61
    " substituir"     0.61
    " soms"           0.61
    Positive Logits
    Token             Logit
    "allows"          1.08
    " allows"         1.07
    "Allows"          1.06
    "allowing"        1.01
    " Allows"         0.96
    " позволяет"      0.95
    " позволяют"      0.95
    " permite"        0.91
    " allowing"       0.91
    " позволя"        0.88
    Activation Density: 0.539%

    No Known Activations