INDEX
    Explanations

    Signs of complex data structures or formatted content, such as code or documents with specific syntactic elements.

    Negative Logits
    Token                Logit
    " المعيارى"          -1.50
    "<unused41>"         -1.48
    "<unused74>"         -1.48
    "<unused28>"         -1.48
    "<unused52>"         -1.48
    "<unused8>"          -1.48
    "<unused79>"         -1.48
    "<unused14>"         -1.47
    "<unused3>"          -1.47
    "[@BOS@]"            -1.47
    Positive Logits
    Token                Logit
    "."                  0.54
    "↵↵"                 0.51
    (token not shown)    0.44
    " for"               0.42
    ","                  0.42
    "i"                  0.40
    "2"                  0.39
    " with"              0.37
    "1"                  0.36
    " level"             0.36
    Act Density 0.113%

    No Known Activations