    Negative Logits
     }}}{
    -0.56
     genoux
    -0.54
     vervol
    -0.54
     épaules
    -0.53
     détru
    -0.52
    kaŭ
    -0.52
     varandra
    -0.52
     }}$}
    -0.51
     δὲ
    -0.51
     récompense
    -0.51
    Positive Logits
     Express
    0.56
    InitVars
    0.56
    <bos>
    0.53
    WithIOException
    0.53
    Express
    0.51
    estacks
    0.50
     EconPapers
    0.47
    Dış
    0.47
     "..\..\
    0.47
     xen
    0.47
    Activation Density: 0.032%

    No Known Activations