INDEX
    Explanations

    phrases indicating explanations or messages related to processes and definitions

    Negative Logits
    "transQ"       -1.08
    "posedge"      -0.86
    " ddelwed"     -0.85
    " queſta"      -0.85
    "ſicht"        -0.81
    "iſen"         -0.81
    "ſammen"       -0.80
    "ſehen"        -0.79
    "<unused68>"   -0.79
    "<unused8>"    -0.79
    Positive Logits
    "Being"        0.30
    "↵↵"           0.30
    "Finding"      0.29
    " Not"         0.29
    "Not"          0.29
    "The"          0.28
    " Cha"         0.28
    "Sadly"        0.27
    " Th"          0.27
    " The"         0.27
    Activation Density: 0.000%

    No Known Activations