INDEX
    Explanations

    terms and phrases related to relational connections or comparisons in structured data

    Negative Logits
    Token         Logit
    <unused41>    -0.76
    [@BOS@]       -0.76
    <unused43>    -0.76
    <unused74>    -0.76
    <unused52>    -0.76
    <unused42>    -0.75
    <unused68>    -0.75
    <unused28>    -0.75
    <unused8>     -0.75
    <unused14>    -0.75
    Positive Logits
    Token            Logit
    " but"           0.47
    " only"          0.39
    " regardless"    0.37
    " then"          0.36
    " and"           0.33
    " inderdaad"     0.32
    "但不"           0.31
    " indeed"        0.31
    " without"       0.30
    " However"       0.30
    Act Density 0.073%

    No Known Activations