INDEX
    Explanations

    reasoning and conditions

    NEGATIVE LOGITS
    " якщо"    0.34
    "अगर"    0.33
    "Wenn"    0.32
    " wenn"    0.31
    " если"    0.31
    " jeśli"    0.30
    "}>"    0.29
    " अगर"    0.29
    " Eğer"    0.29
    " quando"    0.29
    POSITIVE LOGITS
    " этой"    0.34
    " এই"    0.34
    "нето"    0.33
    " this"    0.32
    "<unused2197>"    0.32
    "a"    0.31
    " việc"    0.31
    " inilah"    0.31
    " λοι"    0.31
    " we"    0.30
    Activation Density: 0.329%

    No Known Activations