    Negative Logits
    Token                 Logit
    " Hãy"                -0.59
    " mené"               -0.52
    " manqué"             -0.51
    "Beck"                -0.50
    "Sabes"               -0.50
    "jo"                  -0.50
    "zo"                  -0.50
    "Dono"                -0.49
    "PRIVATE"             -0.49
    " bilin"              -0.49

    Positive Logits
    Token                 Logit
    " disambiguazione"    0.89
    "TagMode"             0.80
    " nếu"                0.74
    "省市镇"              0.74
    " eğer"               0.72
    " jika"               0.72
    "หาก"                 0.71
    "TestingModule"       0.69
    " elseif"             0.69
    "nocześnie"           0.68

    Act Density 0.223%

    No Known Activations