INDEX
    Explanations

    the presence of verbs and indications of actions performed

    multilingual word endings

    Negative Logits
    Token          Logit
    "a"            -0.29
    (not shown)    -0.27
    (not shown)    -0.24
    "y"            -0.23
    " Sternen"     -0.23
    " or"          -0.23
    " no"          -0.23
    "c"            -0.22
    " and"         -0.22
    "h"            -0.21
    Positive Logits
    Token          Logit
    " zwiſchen"    1.28
    "iſchen"       1.28
    "[@BOS@]"      1.27
    "<unused14>"   1.27
    "<unused79>"   1.27
    "niſſe"        1.27
    "<unused74>"   1.27
    "<unused43>"   1.27
    "<unused28>"   1.27
    "<unused41>"   1.27
    Activation Density: 0.014%

    No Known Activations