    NEGATIVE LOGITS
    "كان"  -0.93
    "дете"  -0.91
    "ПА"  -0.88
    " دے"  -0.88
    [missing]  -0.88
    " rozpoczę"  -0.87
    "ρυσ"  -0.86
    "Ogni"  -0.85
    "anggaran"  -0.82
    [missing]  -0.82
    POSITIVE LOGITS
    "return"  1.59
    " return"  1.50
    " finish"  1.36
    "fclose"  1.31
    " exit"  1.24
    " finished"  1.23
    " end"  1.21
    " ending"  1.19
    " terminar"  1.11
    " finishing"  1.07
    Activation density: 0.004%

    No Known Activations