INDEX
    Explanations

    elements related to coding or programming instructions, particularly in a bilingual context

    Negative Logits
    <unused21>    -0.78
    <unused71>    -0.77
    <unused40>    -0.77
    <unused52>    -0.77
    <unused14>    -0.77
    <unused3>     -0.77
    <unused17>    -0.77
    <unused51>    -0.77
    <pad>         -0.77
    [@BOS@]       -0.77
    Positive Logits
    " и"      0.56
    " в"      0.47
    " на"     0.46
    " для"    0.42
    " или"    0.42
    " —"      0.38
    " с"      0.37
    "е"       0.37
    " от"     0.36
    " это"    0.36
    Act Density 0.175%

    No Known Activations