    Explanations

    expressions indicating conclusions or summaries

    Negative Logits
     un     -0.36
     du     -0.36
     "      -0.35
     an     -0.35
     and    -0.35
     ab     -0.34
     to     -0.34
     Now    -0.33
     ex     -0.30
     (      -0.30
    Positive Logits
    rungsseite      1.02
    Hentet          1.02
                    0.91
     '\\;'          0.90
    ViewFeatures    0.87
     utafitiHapana  0.84
     geweſen        0.81
    <unused17>      0.79
    [@BOS@]         0.79
    <unused3>       0.79
    Activation Density: 0.053%

    No Known Activations