INDEX
    Explanations

    references to specific topics or entities in a structured format, such as identification phrases or definitions

    technical and mathematical text segments, particularly those starting with "For" or mathematical formulas.

    Negative Logits
     ujednoznacz    -0.98
    <unused68>      -0.94
    <unused51>      -0.93
    <unused74>      -0.93
    <unused14>      -0.93
    <unused8>       -0.93
    <unused28>      -0.93
    <unused41>      -0.93
    <unused23>      -0.93
    [@BOS@]         -0.93
    Positive Logits
    In         0.34
     EClass    0.31
    The        0.29
    Self       0.29
    "…         0.28
    On         0.28
    “…         0.27
     In        0.27
     В         0.26
     З         0.26
    Act Density 0.643%

    No Known Activations