INDEX
    Explanations

    numerical values, particularly those indicating dates or quantities

    Negative Logits
     with           -0.28
     while          -0.27
     instead        -0.26
     although       -0.25
     by             -0.23
    rinha           -0.23
     Vorschlag      -0.23
     so             -0.21
     zero           -0.20
     most           -0.20
    Positive Logits
     '\\;'          0.88
    <unused74>      0.80
    <unused14>      0.80
    [@BOS@]         0.80
    <unused52>      0.80
    <unused41>      0.80
    <unused43>      0.80
    <unused42>      0.80
    <unused16>      0.80
    <unused28>      0.80
    Activation Density: 0.031%

    No Known Activations